ELECTRONIC EMPLOYEE SELECTION SYSTEMS AND METHODS
ABSTRACT 

An automated employee selection system can use a variety of techniques to
provide information for assisting in selection of employees. For example, pre-hire and
post-hire information can be collected electronically and used to build an
artificial-intelligence based model. The model can then be used to predict a desired job
performance criterion (e.g., tenure, number of accidents, sales level, or the like) for new
applicants. A wide variety of features can be supported, such as electronic reporting.
Pre-hire information identified as ineffective can be removed from the collected pre-hire
information. For example, ineffective questions can be identified and removed from a
job application. New items can be added and their effectiveness tested. As a result, a
system can exhibit adaptive learning and maintain or increase effectiveness even under
changing conditions.

ELECTRONIC EMPLOYEE SELECTION SYSTEMS AND METHODS

TECHNICAL FIELD 
The invention relates to automated employee selection.

COPYRIGHT AUTHORIZATION 

A portion of the disclosure of this patent document contains material that is 
subject to copyright protection. The copyright owner has no objection to the facsimile 
reproduction by anyone of the patent document or the patent disclosure, as it appears in
the Patent and Trademark Office patent file or records, but otherwise reserves all 
copyright rights whatsoever. 

BACKGROUND 

Organizations can spend considerable time and effort identifying and hiring
suitable employees. Good help is hard to find. Despite their best efforts, organizations
still often meet with failure and simply accept high turnover and poor employee
performance.

A variety of approaches to finding and hiring employees have been tried. A 
well-known tool for employee selection is the job application. Job applications help 
identify a job applicant's qualifications, such as educational background, job history,
skills, and experience. 

An employer typically collects a set of job applications from applicants who
drop by an employer work site or appear at a job fair. Someone in the organization then
reviews the applications to determine which applicants merit further investigation.
Then, a job interview, a test, or some other review process is sometimes used to further
limit the applicant pool.

With the advent of the electronic age, job applications can be completed
electronically. In this way, the delays associated with processing paper can be
minimized. However, even electronically-completed job applications can be of
questionable merit and still require considerable effort on the part of the hiring
organization to review them. A better way of selecting employees is still needed.

SUMMARY 

Large organizations can bring considerable resources to bear on the task of
developing a job application. For example, a large retail chain might consult with an
industrial psychologist to study the job environment and develop a set of questions that
ostensibly predict whether an individual will excel in the environment.

However, such an approach is fraught with inaccuracy and subjectivity; further,
the psychologist's analysis depends on conditions that may change over time. For
example, even if the psychologist identifies appropriate factors for testing, an applicant
might slant answers on the application based on what the applicant perceives is
expected. Further, two psychologists might come up with two completely different sets
of factors. And, finally, as the job conditions and applicant pool change over time, the
factors may become less effective or ineffective.

To determine whether a job application is effective, a study can be conducted to 
verify whether the factors chosen by the psychologist have been successful in 
identifying suitable applicants. However, such a study requires even more effort in 
addition to the considerable effort already invested in developing the application. So, 
such a study typically is not conducted until managers in the organization already know
that the application is ineffective or out of date. 

The disclosed embodiments include various systems and methods related to
automated employee selection. For example, various techniques can be used to
automate the job application and employee selection process.

In one aspect of an embodiment, answers to job application questions can be
collected directly from the applicant via an electronic device. Based on correlations of
the answers with answers to questions by other individuals for which post-hire
information has been collected, a post-hire outcome is predicted.

In another aspect of an embodiment, an artificial-intelligence technique is used.
For example, a neural network or a fuzzy logic system can be used to build a model that
predicts a post-hire outcome. Proposed models of different types can be constructed and
tested to identify a superior model.

When constructing a model, an information-theory-based feature selection
technique can be used to reduce the number of inputs, thereby facilitating more efficient
model construction.

Items identified as ineffective predictors can be removed from the job
application. Information collected based on the new job application can be used to build
a refined model. In this way, a system can exhibit adaptive learning and maintain its
effectiveness even if conditions change over time. Content can be rotated or otherwise
modified so the job application changes and maintains its effectiveness over time.
Evolution toward higher predictive accuracy for employee selection can be achieved.

A sample size monitor can identify when sufficient information has been
collected electronically to build a refined model. In this way, short-cycle criterion
validation and performance-driven item rotation can be supported.

Outcomes can be predicted for any of a wide variety of parameters and be
provided in various formats. For example, tenure, number of accidents, sales level,
whether the employee will be involuntarily terminated, whether the employee will be
eligible for rehire upon termination, and other measures of employee effectiveness can
be predicted. The prediction can be provided in a variety of forms, such as, for
example, in the form of a predicted value, a predicted rank, a predicted range, or a
predicted probability that an individual will belong to a group.

Predictions can be provided by electronic means. For example, upon analysis of
a job applicant's answers, an email or fax can be sent to a hiring manager indicating a
favorable recommendation regarding the applicant. In this way, real-time processing of
a job application to provide a recommendation can be supported.

Information from various predictors can be combined to provide a particularly
effective prediction. For example, a prediction can be based at least on whether (or the
likelihood) the applicant will be involuntarily terminated and whether (or the likelihood)
the applicant will be eligible for rehire upon termination. Based on whether the
individual is predicted to both voluntarily quit and be eligible for rehire upon
termination, an accurate measure of the predicted suitability of an applicant can be
provided.

Post-hire information can be based on payroll information. For example,
termination status and eligibility for rehire information can be identified by examining
payroll records. The payroll information can be provided electronically to facilitate a
high level of accurate post-hire information collection.

Further, reports can be provided to indicate a wide variety of parameters, such as
applicant flow, effectiveness of the system, and others.

Although the described technologies can continue to use the services of an
industrial psychologist, relationships between pre-hire data predictors and desired job
performance criteria can be discovered and used without regard to whether the
psychologist would predict such a relationship. A system using the described
technologies can find relationships in data that may elude a human researcher.

Additional features and advantages of the various embodiments will be made
apparent from the following detailed description of illustrated embodiments, which
proceeds with reference to the accompanying drawings.

The present invention includes all novel and nonobvious features, method steps,
and acts alone and in various combinations and sub-combinations with one another as
set forth in the claims below. The present invention is not limited to a particular
combination or sub-combination.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 is a block diagram showing exemplary pre-hire information collection.

Figure 2 is a block diagram showing a predictive model based on pre-hire and 
post-hire information. 

Figure 3 is a block diagram showing ineffective predictors based on pre-hire and
post-hire information.

Figure 4 is a block diagram showing refinement of a model over time.

Figure 5 is a flowchart showing a method for refining a model over time.

Figure 6 is a block diagram showing an exemplary system for providing
employee suitability recommendations.

Figure 7 is a flowchart illustrating an exemplary method for providing employee
suitability recommendations.

Figure 8 is a block diagram illustrating an exemplary architecture for providing
employee suitability recommendations.

Figure 9 is a flowchart illustrating an exemplary method for building a predictive
model.

Figure 10 is a block diagram showing an exemplary predictive model. 
Figure 11 is a block diagram showing an exemplary refined predictive model.
Figure 12 is a block diagram illustrating integration of payroll information into a 
predictive system. 

Figure 13 is a block diagram illustrating an exemplary combination of elements 
into a system. 

Figures 14A-14D are block diagrams illustrating an exemplary process for 
implementing automated employee selection. 

Figure 15 is a process flow diagram illustrating an exemplary process for an
employment suitability prediction system.

Figure 16 is a graph illustrating exemplary effectiveness of a system over time.
Figure 17 is a graph illustrating entropy. 

DETAILED DESCRIPTION 
Overview of the Technologies 

On a general level, the described technologies can include collecting information
and building a model based on the information. Such a model can then be used to
generate a prediction for one or more desired job performance-related criteria. The
prediction can be the basis of a hiring recommendation or other employee selection
information.

Pre-hire information includes any information collected about an individual
before the individual (e.g., a job applicant or other candidate) is hired. FIG. 1 shows a
variety of sources 102 for collecting pre-hire information 112. The pre-hire information
112 can be stored in electronic (e.g., digital) form in a computer-readable medium (e.g.,
RAM, ROM, magnetic disk, CD-ROM, CD-R, DVD-ROM, and the like). Possible
sources for pre-hire information 112 include a paper-based source 122, an electronic
device 124, a third party service 126, or some other source 128. For example, pre-hire
information can include an applicant's answers to an on-line employment application
collected at a remote site, such as at an electronic device located in a kiosk at a
prospective employer's work site. Further information and examples are described in
"Example 2 - Collecting Information," below.

Post-hire information includes any information collected about an individual
(e.g., an employee) after the individual is hired, including information collected while
the employee is employed or after an employee is fired, laid off, or quits. Post-hire
information can similarly be collected from a wide variety of sources. Post-hire
information can include information about the employee's termination date. Further
examples are described in "Example 2 - Collecting Information," below.

As shown in FIG. 2, after pre-hire information 212 and post-hire information
222 have been collected, a predictive model 232 can be built. As described in more
detail below, a predictive model 232 can take a variety of forms, including artificial
intelligence-based models. The predictive model can generate one or more predictions
based on pre-hire information inputs. Thus, the model can be used to generate
predictions for job applicants. In practice, the model can be implemented as
computer-executable code stored in a computer-readable medium.

As shown in FIG. 3, after pre-hire information 312 and post-hire information
322 have been collected, ineffective predictors 332 can be identified. Such ineffective
predictors can be ignored when constructing a model (e.g., the model 232 of FIG. 2). In
this way, the complexity of the model can be reduced, and the efficiency of the model
construction process can be improved.

Further, the same ineffective predictors 332 or similar ineffective predictors can
be removed from pre-hire content (e.g., ineffective questions can be removed from a job
application). Identification of ineffective predictors can be achieved via software using
a variety of techniques; examples are described below.

As shown in FIG. 4, using various features described herein, a predictive model
M1 (412) based on pre-hire information PR1 (414) and post-hire information PO1 (416)
can be refined. For example, information collection techniques can be refined by
removing pre-hire content identified as ineffective. Further, additional pre-hire content
might be added (e.g., a new set of questions can be added to a job application).

As a result, new pre-hire information PR2 (424) based on the refined pre-hire
content can be collected. Corresponding post-hire information PO2 (426) can be
collected. Based on the information, a refined model M2 (422) can be constructed.

The refinement process can be continued. For example, the effectiveness of the
additional pre-hire content can be determined. Thus, refinement can continue a number
of times over time, resulting in pre-hire information PRn (444), post-hire information
POn (446), and a refined model Mn (442).

FIG. 5 shows an exemplary method for refining a predictive model. At 522,
pre-hire information for applicants is collected based on pre-hire content (e.g., predictors
such as questions on an employment application or predictors collected from other
sources). At 532, post-hire information for the applicants is collected. At 542, a
predictive model is constructed. The model can be deployed and model output used for
hiring recommendations. At 552, the pre-hire content can be refined (e.g., one or more
ineffective questions can be removed and one or more new ones can be added). Then,
additional pre-hire information can be collected at 522 (e.g., based on the refined
pre-hire content). Eventually, a refined model can be generated.

The various models shown can be used as a basis for providing employee hiring
recommendations. The architecture used to implement an electronic system providing
such employee hiring recommendations can vary from simple to complex. FIG. 6
shows an overview of an exemplary system 602. In the example, a computer-based
electronic device 612 housed in a kiosk is situated in a work site (e.g., a retail store) and
presents a job application to a job applicant via an electronic display 614. The
electronic device then sends the applicant's answers to a central server 622, which can
also receive information from other electronic devices, such as the electronic device
624.

The server 622 can save the answers to a database 626 and immediately apply a
predictive model to the answers to generate one or more predictions of employment
performance for the applicant and a hiring recommendation based on the predictions.
Thus, real-time processing of incoming data can be accomplished.

The hiring recommendation can be immediately sent to a hiring manager's
computer 642 via a network 652 (e.g., in an email via the Internet). Thus, real-time
reporting based on incoming data can be accomplished. Although often less desirable,
delayed processing is also possible. Thus, alternatively, the system can, for example,
queue information and send it out in batches (e.g., in a set of n applicants or every n
days) as desired.
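
The queued alternative can be illustrated with a short sketch. It is only a minimal illustration, assuming a hypothetical `BatchReporter` class and a caller-supplied `send` function, neither of which appears in the described system. It covers the "set of n applicants" case; a time-based release (every n days) would need a scheduler and is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BatchReporter:
    """Hypothetical helper: queue recommendations and release them n at a time."""

    batch_size: int                      # the "n" in "a set of n applicants"
    send: Callable[[List[str]], None]    # assumed delivery function (e.g., email)
    queue: List[str] = field(default_factory=list, init=False)

    def submit(self, recommendation: str) -> None:
        """Add one recommendation; deliver the batch once it is full."""
        self.queue.append(recommendation)
        if len(self.queue) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Deliver whatever is queued, if anything, and empty the queue."""
        if self.queue:
            self.send(list(self.queue))
            self.queue.clear()
```

Setting `batch_size` to 1 reduces the sketch to the real-time case, in which each recommendation is sent as soon as it is generated.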

Various combinations and sub-combinations of the techniques below can be
applied to any of the above examples.

Example 1 - Exemplary System and Method 
FIG. 7 is a flowchart showing an exemplary method 702 for providing 
automated employee selection. At 712, questions are asked of an applicant such as via 
an electronic device. The answers are collected at 722. Based on the answers, a 
prediction is generated at 732. Then, the results are provided at 742.

FIG. 8 is a block diagram of an exemplary system 802 for providing employee
selection. An electronic data interrogator 812 is operable to present a first set of a
plurality of questions to an individual. An electronic answer capturer 822 is operable to
electronically store the individual's responses to at least a selected plurality of the first
set of questions presented to the individual.

WO 02/13095    CA 02417863 2003-01-30    PCT/US01/24323

An electronic applicant predictor 832 is responsive to the stored answers and is
operable to predict at least one post-hire outcome if the individual were to be employed
by the employer. The applicant predictor 832 can provide a prediction of the outcome
based on correlations of the stored answers with answers to sets of the same questions
by other individuals for which post-hire information has been collected. The predictor
832 can include a model constructed according to techniques described herein, such as
in "Example 3 - Building a Predictive Model" and others.

An electronic results provider 842 can provide an output indicating the outcome
to assist in determining the suitability of the individual for employment by an employer.

Some actions or elements might be performed or implemented by different
parties and are therefore not necessarily included in a particular method or system. For
example, collection of data might be performed by one organization, and another might
generate the prediction.

Example 2 - Collecting Information 

As described with reference to FIG. 1 above, pre-hire information can be a
variety of information collected from a variety of sources. One possible source for
pre-hire information is a paper-based collection source 122, such as a paper-based job
application or test. Paper-based sources can be converted into electronic form by
manual data entry or scanning.

Another possible source is an electronic device 124. Such an electronic device
can, for example, be a computer, a computer-based kiosk, a screen phone, a telephone,
or a biometric device. For example, pre-hire content (e.g., a job application or skills
test) can be presented to an applicant, who responds (e.g., answers questions) directly on
the electronic device 124. Questions can be logically connected so that they are
presented only if appropriate (e.g., if the employee answers affirmatively to a question
about termination, the device can then inquire as to the reason for termination).
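
Logically connected questions can be sketched as follows. This is a minimal illustration under stated assumptions: the `Question` class, its `show_if` condition, and the `administer` function are hypothetical names chosen here, not part of the described device.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Question:
    """Hypothetical question record; show_if decides whether it is presented."""

    key: str
    text: str
    show_if: Optional[Callable[[Dict[str, str]], bool]] = None


def administer(questions: List[Question],
               ask: Callable[[Question], str]) -> Dict[str, str]:
    """Present each question only if its condition holds for the answers so far."""
    answers: Dict[str, str] = {}
    for question in questions:
        if question.show_if is None or question.show_if(answers):
            answers[question.key] = ask(question)
    return answers


# Example content: the follow-up appears only after an affirmative answer.
APPLICATION = [
    Question("terminated", "Have you ever been terminated from a job?"),
    Question("termination_reason", "What was the reason for the termination?",
             show_if=lambda a: a.get("terminated") == "yes"),
]
```

Here `ask` stands in for whatever input mechanism the device provides (touch screen, keypad, voice), so the same branching logic can be reused across device types.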

Still another possible source for pre-hire information 112 is from a third party
service 126. For example, credit reporting agencies, background check services, and
other services can provide information either manually or over an online connection.

Yet another possible source for pre-hire information 112 is from another source
128. For example, later-developed technologies can be incorporated.

Any of the pre-hire information can be collected from a remote location (e.g., at
a work site or from the applicant's home). The information 112 can then be stored in a
central location, such as at an organization's information technology center or at an
employment recommendation service's information technology center or a data
warehouse.

The pre-hire information 112 can be collected for an applicant when the
applicant applies for a job or at other times. For example, data may be obtained
concerning individuals who have yet to apply for employment, such as from an
employee job search web site or firm. The response data can then be used to predict the
probable job effectiveness of an applicant and the results of each prediction. Probable
job effectiveness can be described, for example, in terms of desired criteria and can
include behavioral predictions.

The electronic device can be placed online in a variety of ways. For example, an
external telecommunications data link can be used to upload applicant responses to a
host computer and download changes in pre-hire content, administration instructions,
data handling measures, and other administration functions.

A modem connection can be used to connect via a telephone network to a host
computer (e.g., central server), or a URL can be used to establish a web connection
(e.g., via the Internet, an intranet, an extranet, and the like). Another network type (e.g.,
satellite) can be used. In this way, real-time data collection can be implemented.

The electronic device 124 can allow an applicant to enter text or numeric data or
select from multiple response options, or register a voice or other biophysical response
to a machine-administered stimulus. The electronic device 124 can be programmable so
that the presented content can be modified, and the presented content can be drawn from
a remote source. Such content can include text-based questionnaires, multi-media
stimuli, and biophysical stimuli.

The electronic device 124 can, for example, include computer-readable media 
serving as memory for storing pre-hire content and administration logic as well as the 
applicant's response data. Alternatively, such content, logic, and responses can be
stored remotely. 

The device 124, as other examples, can include a standard computer interface
(e.g., display, keyboard, and a pointing device), hand-held digital telecommunication
devices, digitally enabled telephone devices, touch-screen kiosk delivery systems,
multi-purpose electronic transaction processors such as Automated Teller Machines, travel
reservation machines, electronic gaming machines, and biophysical apparatus such as
virtual reality human interface equipment and biomedical devices.

Further, pre-hire information can include geographic elements, allowing
geographical specialization (e.g., by region, county, state, country, or the like).

Post-hire information can similarly be collected in a variety of ways from a
variety of sources, including evaluations, termination information, supervisor ratings,
payroll information, and direct measures such as sales or units produced, number of
accidents, and the like.

For example, after an employee has been on the job for a sufficient time, an
evaluation can be made. Alternatively, upon termination of the employee, the
employee's supervisor can rate the person's performance in an exit evaluation or the
employee can complete an employee exit interview. Such collection can be
accomplished by receiving answers to questions on an electronic device, such as the
device 124 of FIG. 1.

Other available measures, such as length of service (e.g., tenure), sales, unit
production, attendance, misconduct, number of accidents, eligibility for rehire after
termination, and whether the employee was involuntarily terminated may also be
collected. Generally, post-hire information is collected for post-hire outcomes for
which a prediction is desired. Such outcomes can, for example, include performance or
job effectiveness measures concurrent with employment.

Example 3 - Building a Predictive Model
A variety of techniques can be used to build one or more predictive models for
predicting post-hire outcomes for a job applicant. The model can take one or more
inputs (e.g., pre-hire information) and generate one or more outputs (e.g., predicted
post-hire outcomes). For example, a model can be based on artificial intelligence, such
as a neural network, a structural equation, an information theoretical model, a fuzzy
logic model, or a neuro-fuzzy model.

FIG. 9 shows an exemplary method 902 for building a predictive model. At 912,
information relating to inputs (e.g., pre-hire information) is collected. At 914,
information relating to outputs to be predicted (e.g., post-hire information) is collected.
Based on the inputs and outputs to be predicted, the model is built at 916.

When building a model, a variety of proposed models can be evaluated,
and one(s) exhibiting superior performance can be chosen. For example, various types
of feed-forward neural networks (e.g., back propagation, conjugate gradients,
quasi-Newton, Levenberg-Marquardt, quick propagation, delta-bar-delta, linear, radial basis
function, generalized regression network [e.g., linear], and the like) can be built based
on collected pre- and post-hire data and a superior one identified and chosen. The
proposed models can also be of different architectures (e.g., different number of layers
or nodes in a layer). It is expected that other types of neural networks will be
developed in the future, and they also can be used.

Similar techniques can be used for types of models other than neural networks.
In some cases, trial and error will reveal which type of model is suitable for use. The
advice of an industrial psychologist can also be helpful to determine any probable
interaction effects or other characteristics that can be accounted for when constructing
proposed models.

Various commercially-available off-the-shelf software can be used for
constructing artificial intelligence-based models of different types and architectures.
For example, NEURALWORKS software (e.g., NEURALWORKS Professional
II/Plus) marketed by NeuralWare of Carnegie, Pennsylvania and STATISTICA Neural
Networks software marketed by StatSoft of Tulsa, Oklahoma can be used. Any number
of other methods for building the model can be used.

A model can have multiple outputs or a single output. Further, multiple models
can be built to produce multiple predictions, such as predictions of multiple job
performance criteria. Also, a model can be built to be geographically specialized by
building it based on information coming from a particular region, county, state, country,
or the like.

Occupationally-specialized or education level-specialized models can also be
constructed by limiting the data used to build the model to employees of a particular
occupation or educational level.

One possible way of building a neural network is to divide the input data into
three sets: a training set, a test set, and a hold-out set. The training set is used to train
the model, and the test set is used to test the model and possibly further adjust it.
Finally, the hold-out set is used as a measure of the model's ability to generalize learned
pattern information to new data such as will be encountered when the model begins
processing new applicants. For example, a coefficient (e.g., 0.43) can be calculated to
indicate whether the model is valid based on its ability to predict values of the hold-out
set. Various phenomena related to neural networks, such as over-training, can be
addressed by determining at what point during training the neural network indicates best
performance (e.g., via a test set).
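
The three-way division and the hold-out coefficient can be sketched as follows. This is an illustration under assumptions: the split proportions, the function names `three_way_split` and `validity_coefficient`, and the choice of a Pearson correlation as the coefficient are all assumptions made here; the text above does not specify how its coefficient is computed.

```python
import math
import random
from typing import List, Sequence, Tuple


def three_way_split(records: Sequence, train: float = 0.6, test: float = 0.2,
                    seed: int = 0) -> Tuple[List, List, List]:
    """Shuffle records and divide them into training, test, and hold-out sets.

    The 60/20/20 proportions are an assumption, not taken from the source.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_test = int(len(shuffled) * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


def validity_coefficient(predicted: Sequence[float],
                         actual: Sequence[float]) -> float:
    """Pearson correlation between predicted and actual hold-out outcomes."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    spread_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    if spread_p == 0 or spread_a == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (spread_p * spread_a)
```

In use, a model would be fitted on the training set, adjusted against the test set, and then scored once by calling `validity_coefficient` on its predictions for the hold-out set, which it has never seen.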

Identifying a superior model out of proposed models can be achieved by ranking
the models (e.g., by measuring a validity coefficient for a hold-out set of data). During
the ranking process, particular types (e.g., neural network or fuzzy logic) or
architectures (e.g., number of hidden nodes) may emerge as fruitful for further
exploration via construction of other, similar proposed models.

Example 4 - Identifying Ineffective Predictors
Ineffective (e.g., non-predictive or low-predictive) predictors can be identified.
For example, using an information-theory-based technique called "information
transfer," pre-hire content can be identified as ineffective. Generally, an ineffective
predictor is a predictor that does not serve to effectively predict a desired job
performance criterion. For example, answers to a particular question may exhibit a
random relationship to a criterion and simply serve as noise in data.
One technique for identifying ineffective predictors is to consider various sets of
permutations of predictive items (e.g., answers to job application questions A, B, C, A
& B, A & C, B & C, and A & B & C) and evaluate whether the permutation set is
effective. If an item is not in any set of effective predictors, the item is identified as
ineffective. It is possible that while an item alone is ineffective, it is effective in
combination with one or more other items. Additional features of information
transfer-based techniques are described in greater detail below.
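
The subset evaluation can be sketched as follows. This is not the "information transfer" technique itself, whose details are given elsewhere; as a stand-in it uses mutual information, and it treats an item as effective only if adding it to some set of the other items raises the information about the criterion. That gain test, the `min_gain` threshold, and all function names are assumptions made for illustration. The sets examined are unordered subsets, and their number grows exponentially with the number of items, so the sketch suits only small item pools.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence


def mutual_information(xs: Sequence, ys: Sequence) -> float:
    """Mutual information (in bits) between two equally long discrete series."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
               for (x, y), c in count_xy.items())


def subset_information(answers: Dict[str, List], subset: tuple,
                       outcome: Sequence) -> float:
    """Information the joint answers to the items in `subset` carry about outcome."""
    if not subset:
        return 0.0
    joint = list(zip(*(answers[name] for name in subset)))
    return mutual_information(joint, outcome)


def ineffective_items(answers: Dict[str, List], outcome: Sequence,
                      min_gain: float = 0.05) -> List[str]:
    """Flag items that add less than min_gain bits to every set of other items."""
    flagged = []
    for item in answers:
        others = [name for name in answers if name != item]
        best_gain = 0.0
        for size in range(len(others) + 1):
            for subset in combinations(others, size):
                gain = (subset_information(answers, subset + (item,), outcome)
                        - subset_information(answers, subset, outcome))
                best_gain = max(best_gain, gain)
        if best_gain < min_gain:
            flagged.append(item)
    return flagged
```

With answers to items A and B whose combination determines the criterion while neither does alone, and an unrelated item C, the sketch keeps A and B and flags only C, matching the observation that an item can be ineffective alone yet effective in combination.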

After predictors are identified as ineffective, various actions can be taken, such
as omitting them when constructing a model or removing corresponding questions from
a job application. Or, an indication can be provided that information relating to such
predictors no longer need be collected.

Example 5 - Building a Model Based on Having Identified Ineffective Predictors

Predictors identified as ineffective can be ignored when building a model. In
other words, one part of the model-building process can be choosing inputs for the
model based on whether the inputs are effective.

Reducing the number of inputs can reduce the complexity of the model and
increase the accuracy of the model. Thus, a more efficient and effective model-building
process can be achieved.

Example 6 - Exemplary Model
FIG. 10 shows a simple exemplary predictive model 1002 with predictive inputs
IN1, IN2, IN3, IN4, and IN5. Various weights a1, a2, a3, a4, and a5 can be calculated
during model training (e.g., via back-propagation). The inputs are used in combination
with the weights to generate a predicted value, OUT1. For example, the inputs might be
answers to questions on a job application, and the predicted value might be expected job
tenure.
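
A model of this shape can be sketched as a weighted sum. The sketch assumes the simplest possible combination, a linear sum with no activation function, which the figure description does not confirm; `predict` and `train` are hypothetical names. For a single layer of weights, gradient descent on squared error reduces to the delta rule shown here, which is the one-layer special case of back-propagation.

```python
from typing import List, Sequence, Tuple


def predict(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Combine inputs IN1..INn with weights a1..an into one value, OUT1."""
    if len(inputs) != len(weights):
        raise ValueError("one weight is required per input")
    return sum(a * x for a, x in zip(weights, inputs))


def train(samples: Sequence[Tuple[Sequence[float], float]], n_inputs: int,
          rate: float = 0.05, epochs: int = 200) -> List[float]:
    """Fit the weights by repeatedly nudging them against the prediction error."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            error = predict(inputs, weights) - target
            # Delta rule: move each weight against its share of the error.
            weights = [a - rate * error * x for a, x in zip(weights, inputs)]
    return weights
```

In the job application setting, each input would be a numerically coded answer and each target an observed outcome such as tenure; the learning rate and epoch count above are arbitrary illustrative values.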

A predictive model can estimate specific on-the-job behaviors that have been
described for validation analysis in mathematical terms. Although a two-layer model is
shown, other numbers of layers can be used. In addition, various other arrangements
involving weights and combinations of the elements can be used. In fact, any number of
other arrangements are possible.

Example 7 - Refining a Model

Predictors identified as ineffective can be removed from pre-hire content. For
example, if a question on a job application is found to be an ineffective predictor for
desired job performance criteria, the question can be removed from the job application.
Additional questions can be added (these, too, can be evaluated and possibly removed
later).

New pre-hire information can be collected based on the refined pre-hire content.
Then corresponding new post-hire information can be collected. Based on the new
information, a refined model can be built. Such an arrangement is sometimes called
"performance-driven systematic rotation of pre-hire content."

In this way, questions having little or no value can be removed from an
employment application, resulting in a shorter but more effective application.
Predictive content can be identified by placing a question into the pool of questions and
monitoring whether it is identified as ineffective when a subsequent model is
constructed.

Model refinement can also be achieved through increased sample size,
improvements to model architecture, changes in the model paradigm, and other
techniques.

A system using the described refinement process can be said to exhibit adaptive
learning. One advantage to such an arrangement is that the system can adapt to
changing conditions such as changing applicant demographics, a changing economy, a
changing job market, changes in job content, or changes to measures of job
effectiveness.



Example 8 - Exemplary Refined Model

FIG. 11 shows a simple exemplary refined predictive model 1102. In the
example, it was determined that IN4 and IN5 were ineffective predictors, so the content
(e.g., question) related to IN4 and IN5 was removed from the corresponding employment
application. Based on the finding that IN4 and IN5 were not effective predictors, they
were not included in the model deployed at that time. A set of new questions was added
to the employment application.

When selecting new questions, it may be advantageous to employ the services of 
an industrial psychologist who can evaluate the job and determine appropriate job skills. 
The psychologist can then determine an appropriate question to be asked to identify a
person who will fit the job.

Subsequently, after pre-hire and post-hire information for a number of
employees was collected, the new model 1102 was generated from the collected
information. Two of the new questions were found to be effective predictors, so they
were included in the refined model as IN8 and IN9. IN4 and IN5 do not appear because
they had been earlier found to be ineffective predictors.

Example 9 - Prediction Types 

A predictive model can generate a variety of prediction types. For example, a
single value (e.g., "36 months" as a likely term of employment) can be generated. Or, a
range of values (e.g., "36-42 months" as a likely range of employment term) can be
generated. Or, a rank (e.g., "7 out of 52" as how this applicant ranks in tenure as
compared to 52 other applicants) can be generated.

Further, probabilities can be generated instead of or in addition to the above
types. For example, a probability that an individual will be in a certain range can be
generated (e.g., "70% - 36 or more months"). Or, a probability of a certain value can be
generated ("5% - 0 accidents"). Or, probability of membership in a group can be
generated (e.g., "75% involuntarily terminated").

Various combinations and permutations of the above are also possible. Values
can be whatever is appropriate for the particular arrangement.
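
The prediction types above can be pictured with a few small helper functions. The Python sketch below is only illustrative: the function names are hypothetical, the rank treats the pool as including the applicant being ranked, and the probability is estimated (as an assumption) from the share of comparable past cases that reached a threshold rather than from any particular model.

```python
def as_range(point_estimate, margin):
    """Turn a single predicted value into a (low, high) range."""
    return (point_estimate - margin, point_estimate + margin)


def rank_in_pool(applicant_value, other_values):
    """Return (rank, pool_size); rank 1 is the highest predicted value."""
    higher = sum(1 for value in other_values if value > applicant_value)
    return higher + 1, len(other_values) + 1


def probability_at_least(past_values, threshold):
    """Share of comparable past cases whose value met the threshold."""
    if not past_values:
        raise ValueError("at least one past case is required")
    hits = sum(1 for value in past_values if value >= threshold)
    return hits / len(past_values)


print(as_range(39, 3))                                # (36, 42) months
print(rank_in_pool(30, [10, 40, 50, 20]))             # (3, 5)
print(probability_at_least([40, 30, 50, 36, 20], 36)) # 0.6
```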

Example 10 - Predicted Outcomes 

Predicted post-hire outcomes can be any of a number of metrics. For example,
number of accidents, sales level, eligibility for rehire, voluntary termination, and tenure
can be predicted. There can be various models (e.g., one for each of the measurements)
or one model can predict more than one. The predicted outcomes can be job
performance criteria used when making a hiring recommendation.

Example 11 - Hiring Recommendation 

After determining the suitability of the individual for employment by the
employer, based on one or more predictions generated by one or more models, a hiring
recommendation can be made. The recommendation can be provided by software.

The recommendation can include an estimate of future behavior and results can
be reported in behavioral terms. Alternatively, an employer might indicate the relative
importance of predicted outcome values, such as a specific set of job performance
criteria. Such information can be combined with generated predicted outcomes to
generate an overall score. Applicants having a score over a particular threshold, for
example, can be identified as favorable candidates. Further evaluation (e.g., a skills test
or interview) may or may not be appropriate.
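
One way to combine employer-indicated importance with predicted outcomes is an importance-weighted average. The following Python sketch assumes that each predicted outcome has already been scaled to the range 0 to 1; the outcome names, weights, and the 0.7 threshold are hypothetical.

```python
def overall_score(predictions, importance):
    """Importance-weighted average of predicted outcomes scaled to 0..1."""
    total_weight = sum(importance.values())
    if total_weight <= 0:
        raise ValueError("importance weights must sum to a positive value")
    weighted = sum(importance[name] * predictions[name] for name in importance)
    return weighted / total_weight


def is_favorable(score, threshold):
    """A candidate is favorable when the score is over the threshold."""
    return score > threshold


# Hypothetical scaled predictions and employer-supplied importance weights
predictions = {"tenure": 0.8, "sales": 0.5, "safety": 1.0}
importance = {"tenure": 2.0, "sales": 1.0, "safety": 1.0}

score = overall_score(predictions, importance)
print(round(score, 3), is_favorable(score, threshold=0.7))  # 0.775 True
```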

Example 12 - Payroll-Based Information Collection

A problem can arise when collecting post-hire information. For example, it may
be difficult to achieve high compliance rates for exit interviews. Also, collection of
information relating to termination dates and reasons for termination may be sporadic.

Post-hire information can be generated by examining payroll information. For
example, a system can track whether an employee has been dropped from the payroll.
Such an event typically indicates that the employee has been terminated. Thus, the
employee's tenure can be determined by comparing the termination date with the
employee's hire date. Further, available payroll information might indicate whether an
employee was voluntarily or involuntarily terminated and whether or not the employee
is eligible for rehire and why the termination occurred. Still further, the payroll
information can indicate a job change (e.g., a promotion).
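
The payroll comparison can be sketched in a few lines of Python. The sketch assumes that each payroll run is available as a collection of employee identifiers and that the date of the last run containing an employee stands in for the termination date; the function names and identifiers are hypothetical.

```python
from datetime import date


def dropped_from_payroll(previous_ids, current_ids):
    """Employees present in the previous payroll run but not the current one."""
    return set(previous_ids) - set(current_ids)


def tenure_days(hire_date, termination_date):
    """Tenure as the number of days between hire and termination."""
    if termination_date < hire_date:
        raise ValueError("termination date precedes hire date")
    return (termination_date - hire_date).days


print(dropped_from_payroll(["E1", "E2", "E3"], ["E1", "E3"]))  # {'E2'}
print(tenure_days(date(2001, 1, 15), date(2001, 7, 14)))       # 180
```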

Thus, much post-hire information can be commonly collected based on payroll
information, and a higher sample size can be achieved. An exemplary arrangement
1202 for collecting such information is shown in FIG. 12. In the example, the payroll
information 1212 is accessible by a payroll server 1222. Communication with the
payroll server 1222 can be achieved over a network 1242 (e.g., via the Internet or
another network). The server 1242 receives information from the payroll server 1222
via the network 1232 (e.g., via any number of protocols, such as FTP, email, and the
like). The information is then stored in the post-hire information database 1252. For
example, payroll information can be scheduled for automatic periodic sending or may be
sent upon initiation by an operator.

Although an online arrangement is shown, the information can also be provided
manually (e.g., via removable computer-readable media). In some cases, the
information may need to be reformatted so it matches the format of other data in the
database 1252.

Example 13 - Exemplary Implementations

In various implementations of the technologies, a computer-implemented system
can be provided that collects pre-hire applicant information used to assess suitability for
employment in specific jobs. The computer system can also collect post-hire measures
of the job effectiveness of employees hired using the system.

The pre-hire and post-hire information can then be converted and stored
electronically as numeric data where such data can be logically quantified. Artificial
intelligence technology and statistical analysis can be used to identify patterns within
the pre-hire data that are associated with patterns of job effectiveness stored in the
post-hire data. Pre-hire data patterns with significant associations with different post-hire
patterns are then converted to mathematical models (e.g., data handling routines and
equations) representing the observed relationships.

Following the development of interpretive algorithms that operationalize the
pattern relationships observed in a sample of complete employment cycles, the pre-hire
data collection system can then be re-programmed to run such interpretive formulas on
an incoming data stream of new employment applications. Formula results can be
interpreted as an estimate of the probable job effectiveness of new applicants for
employment based on response pattern similarity to others (e.g., employees).
Interpretive equation results can be reported in behavioral terms to hiring managers who
can use the information to identify and hire those applicants whose estimated job
performance falls within an acceptable range.

The system can be capable of adaptive learning, or the ability to modify
predictive models in response to changing data patterns. Adaptive learning can be
operationalized using artificial intelligence technologies, short cycle validation
procedures and performance-driven item rotation. The validation cycle can be repeated
periodically as new employment histories are added to the database. With successive
validation cycles, pre-hire predictor variables that have little or no relationship to job
effectiveness can be dropped. New item content can replace the dropped items.
Predictive variables can be retained and used by interpretive algorithms until sufficient
data has accumulated to integrate the new predictors into the next generation
interpretive algorithm. The outdated algorithm and associated records can be archived
and the new model deployed. Adaptive learning can enable evolutionary performance
improvement, geographic specialization, and shorter, more accurate pre-hire
questionnaires.

Example 14 - Criterion Validation 

Criterion validation includes discovering and using measures of individual
differences to identify who, out of a group of candidates, is more likely to succeed in a
given occupation or job. Individual differences are measures of human characteristics
that differ across individuals using systematic measurement procedures. Such measures
include biographic or life history differences, standardized tests of mental ability,
personality traits, work attitudes, occupational interests, work-related values and beliefs,
and tests of physical capabilities, as well as traditional employment-related information,
such as employment applications, background investigation results, reference checks,
education, experience, certification requirements, and the like.

Criterion validation includes the research process used to discover how these
measures of individual differences relate to a criterion or standard for evaluating the
effectiveness of an individual or group performing a job. Typical measures of job
effectiveness include performance ratings by managers or customers, productivity
measures such as units produced or dollar sales per hour, length of service, promotions
and salary increases, probationary survival, completion of training programs, accident
rates, number of disciplinary incidents or absences, and other quantitative measures of
job effectiveness. Any of these measures of job effectiveness and others (e.g., whether
an applicant will be involuntarily terminated, and the like) can be predicted via a model.

Pre-hire metrics, including those listed above, called predictors, can be analyzed
in relation to each criterion to discover systematic co-variation. A common statistic
used to summarize such relationships is the Pearson Product Moment Correlation
coefficient, or simply the validity coefficient. If a predictor measure is found to
correlate with a criterion measure across many individuals in a validation sample, the
predictor is said to be "valid," that is, predictive of the criterion measure. Valid
predictors (e.g., pre-hire information) that correlate with specific criteria, such as
post-hire measures (e.g., including concurrent performance measures) are then used in the
evaluation of new candidates as they apply for the same or similar jobs. Individual
differences in temperament, ability, and other measures can have profound and
measurable effects on organizational outcomes.
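
The validity coefficient mentioned above is the standard Pearson product-moment correlation. A minimal Python sketch follows; the function name `pearson_r` and the sample scores are illustrative assumptions, and a real validation study would use a much larger sample.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Hypothetical predictor scores and tenure in months for five employees
predictor = [2, 4, 5, 7, 9]
tenure = [10, 14, 20, 22, 30]
print(round(pearson_r(predictor, tenure), 3))
```

A coefficient near +1 or -1 suggests a strong linear relationship between the predictor and the criterion, while a value near 0 suggests little linear relationship.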

In employee selection, an independent (e.g., "predictor") variable can be any
quantifiable human characteristic with a measurable relationship to job performance.
Physical measurements, intelligence tests, personality inventories, work history data,
educational attainment, and other job-related measures are typical. The dependent (e.g.,
"criterion") variable can be defined as a dependent or predicted measure for judging the
effectiveness of persons, organizations, treatments, or predictors of behavior, results,
and organizational effectiveness.

In general, measures of job performance include objective numeric data, such as
absenteeism, accident rates, and unit or sales productivity, that can be readily verified
from direct observation and are sometimes called "hard" measures. Objective measures of job
performance may be available for only a small set of narrowly-defined production and
other behaviorally-specific jobs. In the absence of hard measurement, opinion data such
as performance ratings by managers can be used for the same purpose.

Establishing the criterion validity of a selection test or group of tests can include 
informed theory building and hypothesis testing that seeks to confirm or reject the 
presence of a functional relationship.

Example 15 - Artificial Intelligence Techniques

Artificial intelligence can attempt to simulate human intelligence with computer
circuits and software. There are at least three approaches to machine intelligence:
expert systems, neural networks, and fuzzy logic systems. Expert systems can capture
knowledge of human experts using rule-based programs to gather information and make
sequential decisions based on facts and logical branching. These systems involve
human experts for constructing the decision models necessary to simulate human
information processing. Expert systems can be used to standardize complex procedures
and solve problems with clearly defined decision rules.

Neural networks (also commonly called "neural systems," "associative
memories," "connectionist models," "parallel distributed processors," and the like) can
be computer simulations of neuro-physiological structures (e.g., nerve cells) found in
nature. Unlike expert systems, artificial neural networks can learn by association or
experience, rather than being programmed. Like their biological counterparts, neural
networks form internal representations of the external world as a result of exposure to
stimuli. Once trained, they can generalize or make inferences and predictions about
data that they have not been exposed to before. Neural networks are able to create
internal models of complex, nonlinear multivariate relationships, even when the source
data is noisy or incomplete. It is this capacity to function with uncertain or fuzzy data
that makes a neural processor valuable in the real world.

Fuzzy computation includes a set of procedures for representing set membership,
attributes, and relationships that cannot be described using single point numeric
estimates. Fuzzy systems can allow computers to represent words and concepts such as
vagueness, uncertainty, and degrees of an attribute. Fuzzy systems can allow computers
to represent complex relationships and interactions between such concepts. They can
also be a useful tool for describing human attributes in terms that a computer can
process. Fuzzy concepts and fuzzy relationship models can be used in an employee
selection system to represent predictor-criterion interactions when such relationships are
supported by analysis of the available data.
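
A degree of an attribute can be represented with a membership function that returns a value between 0 and 1. The Python sketch below uses a triangular membership function, which is one common choice and an assumption here; the attribute "moderate experience" and its breakpoints (0, 5, and 10 years) are hypothetical.

```python
def triangular(x, left, peak, right):
    """Degree of membership (0..1) in a triangular fuzzy set."""
    if not left < peak < right:
        raise ValueError("expected left < peak < right")
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Degree to which various years of experience count as "moderate experience"
for years in (0, 2.5, 5, 7.5, 10):
    print(years, triangular(years, left=0, peak=5, right=10))
```

Rather than classifying an applicant as either having or not having moderate experience, the function reports partial membership (e.g., 0.5 at 2.5 years).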

Neuro-fuzzy technology is a hybrid artificial intelligence technique employing
the capabilities of both neural network learning and fuzzy logic model specification. In
an employee selection system, predictor-criterion relationships can be described initially
as a fuzzy model and then optimized using neural network training procedures. In the
absence of evident explanatory predictor-criterion relationships, unspecified neural
networks can be used until such relationships can be verified.

Genetic algorithms can represent intelligent systems by simulating evolutionary
adaptation using mathematical procedures for reproduction, genetic crossover, and
mutation. In an employee selection system, genetic algorithm-based data handling
routines can be used to compare the prediction potential of various combinations of
predictor variables to optimize variable selection for model development.

Information theoretic based feature selection can be based on information theory.
Such a technique can use measures of information transmission to identify relations
between independent and dependent variables. Since information theory does not
depend on a particular model, relation identification is not limited by the nature of the
relation. Once the identification process is complete, the set of independent variables
can be reduced so as to include only those variables with the strongest relationship to
the dependent variables.
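
One widely used measure of information transmission between two discrete variables is mutual information. The following Python sketch uses it to keep only predictors that share at least a chosen number of bits with the criterion. Treating mutual information as the measure, the function names, the 0.5-bit cutoff, and the tiny data set are all assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("need two non-empty sequences of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    total = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        total += p_xy * log2(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return total


def select_effective(predictors, criterion, min_bits):
    """Names of predictors sharing at least min_bits with the criterion."""
    return [name for name, values in predictors.items()
            if mutual_information(values, criterion) >= min_bits]


# Hypothetical coded answers (q1, q2) and a binary criterion (stayed 1 year)
predictors = {"q1": [0, 0, 1, 1], "q2": [0, 1, 0, 1]}
stayed = [0, 0, 1, 1]
print(select_effective(predictors, stayed, min_bits=0.5))  # ['q1']
```

Because this sketch scores each predictor alone, it would miss an item that is informative only in combination with others; scoring tuples of items jointly is one way to address that.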

Such a pre-filtering process facilitates the modeling process by removing inputs
which are (e.g., for the most part) superfluous and would therefore constitute input noise
to the model. A reduction in the dimensionality of the input vector to the model also
reduces the complexity of the model and in some cases (e.g., neural networks), greatly
reduces the computational expense involved in model generation.

Information theoretic-based modeling techniques such as reconstructability
analysis can be used in an employee selection system. Such techniques use
informational dependencies between variables to identify the essential relations within a
system. The system is then modeled by reproducing the joint probability distributions
for the relevant variables. The benefits of such modeling techniques include that they
do not depend on a model and can emulate both deterministic and stochastic systems.

An employee selection system can include adaptive learning technology. Such a
system can be constructed as a hybrid artificial intelligence application, based in part on
various (or all) of the above artificial intelligence technologies. Expert systems can be
employed to collect and process incoming and outgoing data, transfer data between
sub-systems internally and in model deployment. Neural networks can be used for variable
selection, model development, and adaptive learning. Fuzzy set theory, fuzzy variable
definition, and neuro-fuzzy procedures can be used in variable specification, model
definition, and refinement. Genetic algorithm techniques can be used in variable
selection, neural network architecture configuration and model development and testing.
Information theoretic feature selection and modeling techniques can be used in data
reduction, variable selection, and model development.



Example 16 - Electronic Repository System

Externally-collected data can be sent to an in-bound communications sub-system
that serves as a central repository of information. Data can be uploaded via a variety of
techniques (e.g., telephone lines, Internet, or other data transfer mechanisms). The
in-bound communications sub-system can include a set of software programs to perform
various functions.

For example, the sub-system can receive incoming data from external data
collection devices. The incoming data can be logged with a date, time and source
record. Data streams can be stored to a backup storage file.

After data reception, the subsystem can respond to the source device with a text
message indicating that transmission was successful or unsuccessful; other messages or
instructions can be provided. The data stream can be transferred to a transaction
monitor (e.g., such as that described below) for further processing.

The subsystem can also download machine-specific executable code and
scripting files to external data collection devices when changes to the user-interface are
desired. The download transmissions can be logged by date, time, and status and the
external device's response recorded.



Example 17 - Transaction Monitor

A transaction monitor can serve as an application processing system that directs
information flow and task execution between and among subsystems. The transaction
monitor can classify incoming and outgoing data streams and launch task-specific
sub-routines using multi-threaded execution and pass sub-routine output for further
processing until transactions (e.g., related to data streams) have been successfully
processed.

A transaction monitor can perform various functions. For example, the
transaction monitor can classify data streams or sessions as transactions after
transmission to an in-bound communications sub-system. Classification can indicate
the processing tasks associated with processing the transaction.

Data can be parsed (e.g., formatted into a pre-defined structure) for additional
processing and mapped to a normalized relational database (e.g., the applicant database
described below). Data elements can be stored with unique identifiers into a table
containing similar data from other sessions.

Session processing task files can be launched to process parsed data streams.
For example, an executable program (e.g., C++ program, dynamic link library,
executable script, or the like) can perform various data transmission, transformation,
concatenation, manipulation or encoding tasks to process the sessions.

Output from session processing tasks can then be formatted for further
processing and transmission to external reporting devices (e.g., at an employer's site).
For example, the imaging and delivery sub-system described below can be used.

Example 18 - Applicant Database

A relational database can store pre- and post-employment data for session
transactions that are in process or were received and recently processed. As individual
session records age, they can be systematically transferred to another storage database
(e.g., the reports database described below).

Both databases can consist of electronically-stored tables made up of rows and
columns of numeric and text data. In general, rows contain identifier keys (e.g., unique
keys) that link elements of a unique session to other data elements of that session.
Columns can hold the component data elements. Unique session data can be stored
across many tables, any of which may be accessed using that session's unique
identification key.
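
The idea of one session key linking rows across several tables can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample values below are hypothetical and are not the schema of the described system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables sharing a session key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contact (session_id TEXT PRIMARY KEY, name TEXT, phone TEXT)")
    conn.execute(
        "CREATE TABLE response (session_id TEXT, item TEXT, answer TEXT)")
    conn.execute("INSERT INTO contact VALUES (?, ?, ?)",
                 ("S-0001", "Pat Doe", "555-0100"))
    conn.executemany("INSERT INTO response VALUES (?, ?, ?)",
                     [("S-0001", "q1", "yes"), ("S-0001", "q2", "no")])
    return conn


def session_record(conn, session_id):
    """Gather one session's data from both tables using its unique key."""
    contact = conn.execute(
        "SELECT name, phone FROM contact WHERE session_id = ?",
        (session_id,)).fetchone()
    answers = dict(conn.execute(
        "SELECT item, answer FROM response WHERE session_id = ?",
        (session_id,)).fetchall())
    return {"contact": contact, "answers": answers}


conn = build_demo_db()
print(session_record(conn, "S-0001"))
```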

An arrangement of three basic types of data can be used for the applicant
database. First, standard pre-hire application information (e.g., name, address, phone
number, job applied for, previous experience, references, educational background, and
the like) can be stored. Also included can be applicant responses to psychological or
other job-related assessments administered via an external data collection device (e.g.,
the electronic device 124 of FIG. 1).

Second, post-hire data about the job performance of employees after being hired
can be stored. Such data can include, for example, supervisor opinion ratings about the
employee's overall job performance or specific aspects of the employee's job
effectiveness. Quantitative indicators about attendance, sales or unit production,
disciplinary records and other performance measures may also be collected.

Third, employer-specific information used to process transactions can be stored.
Such data can include information for sending an appropriate electronic report to a
correct employer location, information related to downloading user interface
modifications to specific data collection devices, and information for general
management of information exchange between various sub-systems. For example,
employer fax numbers, URLs, email accounts, geographic locations, organizational
units, data collection unit identifier, and the like can be stored.

Other information or less information can be stored in the database. Further, the 
database may be broken into multiple databases if desired. 

Example 19 - Reports Database 

A reports database can be a relational database serving as a central repository for
records processed by the applicant database. Applicant records for applicants not hired
can be deleted. Applicant records for applicants aged over a certain client-specified
record retention time limit can be deleted.

The reports database can be used as a source for the data used in generating,
printing, or posting corporate reports (e.g., such as those described below). Such data
can include client-specific records of employment applications received for recent
reporting periods, plus pre-hire predictor and post-hire criterion performance data.

Example 20 - Corporate Reports

Useful information can be collected in the course of operating a hiring
recommendation system. For example, information about applicant flow, hiring
activity, employee turnover, recruiting costs, number of voluntary terminations,
applicant and employee characteristics and other employee selection metrics can be
collected, stored, and reported.

Standardized reports can be provided to employers via printed reports, fax 
machines, email, and secure Internet web site access. Source data can come from the 
reports database described above. Custom reports can also be generated. 

Example 21 - Sample Size Monitor

A sample size monitor can be provided as a computer program that monitors the
quality and quantity of incoming data and provides an indication when a sufficient
number of predictor-criterion paired cases have accumulated. For example,
employer-specific validation data can be transferred to a model development environment upon
accumulation of sufficient data.

The program can use an expert system decision rule base to keep track of how
many complete employee life cycle histories are in a reports database. In addition, the
software can examine and partition individual records that may be unusable due to
missing fields, corrupted data, or other data fidelity problems. Using pre-defined
sample size boundaries, the software can merge available pre- and post-hire data
and transfer a file to the validation queue (e.g., the queue described below).
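
The monitoring logic can be sketched as a partition step followed by a threshold check. In the Python sketch below, the required field names and the minimum case count are hypothetical, and a record is treated as unusable only when a required field is missing or empty (`None`); detecting corrupted data would need additional rules.

```python
# Hypothetical fields that a complete predictor-criterion case must carry
REQUIRED_FIELDS = ("applicant_id", "pre_hire", "post_hire")


def partition_records(records):
    """Split records into (usable, unusable) by presence of required fields."""
    usable, unusable = [], []
    for record in records:
        complete = all(record.get(field) is not None for field in REQUIRED_FIELDS)
        (usable if complete else unusable).append(record)
    return usable, unusable


def ready_for_validation(records, min_cases):
    """True once enough complete paired cases have accumulated."""
    usable, _ = partition_records(records)
    return len(usable) >= min_cases


records = [
    {"applicant_id": 1, "pre_hire": {"q1": 1}, "post_hire": {"tenure_days": 200}},
    {"applicant_id": 2, "pre_hire": {"q1": 0}, "post_hire": None},
    {"applicant_id": 3, "pre_hire": {"q1": 1}},
]
usable, unusable = partition_records(records)
print(len(usable), len(unusable))                  # 1 2
print(ready_for_validation(records, min_cases=2))  # False
```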

Example 22 - External Service Providers

A system can interface with other online data services of interest to employers.
Using a telecommunication link to third party service computers, a transaction monitor
can relay applicant information to trigger delivery of specialized additional pre-hire data
which can then be added to an applicant database and used in subsequent analysis and
reporting. Such services can include, for example, online work opportunity tax credit
(WOTC) eligibility reporting, online social security number verification, online
background investigation results as indicated by specific jobs, and psychological
assessment results, including off-line assessment. Such services are represented in FIG.
1 as the third party service 126.

Example 23 - Validation Queuing Utility

Validation queuing utility software can be provided to serve as a temporary
storage location for criterion validation datasets that have not yet been processed in a
model development environment (e.g., such as that described below). Datasets can be
cataloged, prioritized, and scheduled for further processing using predefined decision
rules. When higher priority or previously-queued datasets have been processed, the file
can be exported to the analysis software used for model development.
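
A queue that releases datasets by priority, and in arrival order within the same priority, can be sketched with Python's heapq module. The class name, the convention that a lower number means higher priority, and the dataset names are assumptions for illustration.

```python
import heapq
import itertools


class ValidationQueue:
    """Holds datasets until they can be exported for model development."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: earlier arrivals first

    def add(self, dataset_name, priority):
        """Queue a dataset; a lower priority number is processed sooner."""
        heapq.heappush(self._heap, (priority, next(self._arrival), dataset_name))

    def next_dataset(self):
        """Remove and return the next dataset name, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = ValidationQueue()
queue.add("employer_a_tenure", priority=2)
queue.add("employer_b_sales", priority=1)
queue.add("employer_c_tenure", priority=2)
print(queue.next_dataset())  # employer_b_sales
print(queue.next_dataset())  # employer_a_tenure
```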

Example 24 - Model Development Technique

Model development can result in the creation of a model that represents
observed functional relationships between pre-hire data and post-hire data. Artificial
intelligence technologies can be used to define and model such relationships. Such
technologies can include expert systems, neural networks and similar pattern function
simulators, fuzzy logic models, and neuro-fuzzy predictive models.

Various procedures can be implemented. For example, the distribution of
pre-hire variables (sometimes called "independent" or "predictor variables") can be
analyzed in relation to the distribution of post-hire outcome data (sometimes called
"dependent" or "criterion variables").

Using statistical and information theory derived techniques, a subset of predictor
variables can be identified that show information transfer (e.g., potential predictive
validity) to one or more criterion variables.

An examination of joint distributions may result in the formalization of a fuzzy
theoretical model and certain predictors may be transformed to a fuzzy variable format.

If an obvious theoretical model does not emerge from this process, the remaining
subset of promising variables can be categorized and transformed for neural network
training. Non-useful (e.g., ineffective) predictor variables can be dropped from further
analysis.

The total sample of paired predictor-criterion cases (e.g., individual employee
case histories) can be segmented into three non-overlapping sub-samples with group
membership being randomly defined. Alternate procedures, such as randomized
membership rotation may also be used to segment the data.

A training set can be used to train a neural network or neuro-fuzzy model to
predict, classify, or rank the probable criterion value associated with each instance of
predictor input variables. A test set can be used to evaluate and tune the performance
(e.g., predictive accuracy) of models developed using the training set. A hold-out or
independent set can be used to rank trained networks by their ability to generalize
learning to unfamiliar data. Networks with poor predictive accuracy or low
generalization are dropped from further development.
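
The random segmentation into training, test, and hold-out sub-samples can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the function name are assumptions; the text above does not specify sub-sample sizes.

```python
import random


def three_way_split(cases, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split cases into non-overlapping train, test, and hold-out sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    n_train = round(len(shuffled) * fractions[0])
    n_test = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    holdout = shuffled[n_train + n_test:]  # whatever remains
    return train, test, holdout


case_ids = list(range(10))  # stand-ins for paired predictor-criterion cases
train, test, holdout = three_way_split(case_ids)
print(len(train), len(test), len(holdout))  # 6 2 2
```

Because each case lands in exactly one slice of the shuffled list, the three sub-samples cannot overlap.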

Surviving trained models can then be subjected to additional testing to evaluate
acceptability for operational use in employee selection. Such testing can include
adverse impact analysis and selection rate acceptability.

Adverse impact analysis can evaluate model output for differential selection
rates or bias against protected groups. Using independent sample output, selection rates
can be compared across gender, ethnicity, age, and other class differences for bias for or
against the groups. Models which demonstrate differential prediction or improper bias
can be dropped from further development.
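
A comparison of selection rates across groups can be sketched as below. The group labels and sample outcomes are hypothetical. The default ratio of 0.8 echoes the "four-fifths" rule of thumb from U.S. employee selection guidelines; it is an assumption here and not a value stated in this document.

```python
def selection_rates(outcomes):
    """Selection rate per group from (group, was_selected) pairs."""
    totals, selected = {}, {}
    for group, was_selected in outcomes:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(bool(was_selected))
    return {group: selected[group] / totals[group] for group in totals}


def flag_adverse_impact(rates, min_ratio=0.8):
    """Groups whose rate is below min_ratio times the highest group's rate."""
    highest = max(rates.values())
    if highest == 0:
        return []  # nobody was selected, so no ratio can be formed
    return sorted(g for g, rate in rates.items() if rate / highest < min_ratio)


# Hypothetical hold-out sample: group A selected 5 of 10, group B 3 of 10
outcomes = ([("A", True)] * 5 + [("A", False)] * 5
            + [("B", True)] * 3 + [("B", False)] * 7)
rates = selection_rates(outcomes)
print(rates)                       # {'A': 0.5, 'B': 0.3}
print(flag_adverse_impact(rates))  # ['B']
```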

Selection rate acceptability can include evaluation of selection rates for
hire/reject classification models. Selection rates on the independent sample can be
evaluated for stringency (e.g., rejects too many applicants) or leniency (e.g., accepts too
many applicants) and models showing these types of errors can be dropped.
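
A stringency and leniency check can be as simple as comparing a model's selection rate with an acceptable band. In the Python sketch below, the band limits (20% to 60%) and the function name are hypothetical; appropriate limits would depend on the employer and the job.

```python
def selection_rate_check(decisions, low, high):
    """Classify a hire/reject model's selection rate against a band.

    decisions: iterable of booleans, True meaning the model said "hire".
    """
    decisions = list(decisions)
    if not decisions:
        raise ValueError("at least one decision is required")
    rate = sum(1 for d in decisions if d) / len(decisions)
    if rate < low:
        return rate, "too stringent"
    if rate > high:
        return rate, "too lenient"
    return rate, "acceptable"


# Hypothetical hold-out decisions: 1 hire out of 10 applicants
print(selection_rate_check([True] + [False] * 9, low=0.2, high=0.6))
# (0.1, 'too stringent')
```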

Final candidate networks can be ranked according to their performance on test
parameters, and the single best model can be converted to a software program for
deployment in a live employee selection system. The coded program can then be passed
to the deployment and archiving modules (e.g., such as those described below).

Such an iterative process can be repeated as different predictor-criterion
relationships emerge. As sufficient data accumulates on specific criterion outcomes,
additional predictive models can be developed. Older models can eventually be
replaced by superior performing models as item content is rotated to capture additional
predictive variation (e.g., via the item rotation module described below). Sample size
can continue to increase. Thus, a system can evolve toward higher predictive accuracy.

Example 25 - Model Deployment Technique 
Deployment of a model can include a hiring report modification and model
insertion. The hiring report modification can include modifications to an imaging and
delivery subsystem and an applicant processing system (e.g., the above-described
transaction monitor).

To facilitate employer use of model predictions, numeric output can be
translated into text, numbers, or graphics that are descriptive of the behavior being
predicted. Output can be presented to an employer in behavioral terms.

When a criterion to be predicted is a number, the exact numeric estimate can be
couched in a statement or picture clearly describing the predicted behavior. For
example, if the model has produced an estimate of an applicant's probable length of
service in days, the hiring report can be modified to include a statement such as the
following example:

Based on similarity to former employees, this applicant's
estimated length of service is X days, plus or minus Y days margin of
error.

X can be the specific number of days that the trained predictive model has provided as
an estimate of the applicant's probable length of service, and Y can be the statistical
margin of error in which the majority of cases will tend to fall.

When the criterion to be predicted is group membership (e.g., whether or not the
applicant is likely to belong to a specific group), the model estimate may be expressed
as a probability, or likelihood, that the applicant will eventually be classified in that
group. For example, if the predictive model has been trained to classify employee
response patterns according to the probability that they would be eligible for rehire
instead of not being eligible for rehire upon termination, a statement or graphic similar
to the following example can be presented on a hiring report:

Based on similarity to former and/or current employees, this
applicant's probability of being eligible for rehire upon termination is X
percent.

X can be a probability function expressed as a percentage representing the number of
chances in one hundred that the particular applicant will be eligible for rehire when he
or she leaves the company.

When the criterion produced is a ranking or relative position in a ranked
criterion, text or graphic images can be used to convey the applicant's position in the
criterion field. For example, if the model has produced an estimate of the probable rank
of a sales employee's annual sales volume compared to past sales employees, a
statement similar to the following example might be used:

Based on similarity to former sales employees, this applicant is
likely to produce annual sales in the top Xth (e.g., third, quarter, fifth, or
the like) of all sales employees.

X can refer to the ranking method used to classify the criterion measure.
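
The three report statements above can be produced by simple text templates. The
sketch below is illustrative only; the function names and the rounding of values to
whole numbers are assumptions, while the wording follows the example statements.

```python
def tenure_statement(days, margin):
    """Numeric criterion: estimated length of service with a margin of error."""
    return ("Based on similarity to former employees, this applicant's "
            f"estimated length of service is {round(days)} days, plus or minus "
            f"{round(margin)} days margin of error.")

def rehire_statement(probability):
    """Group-membership criterion: probability expressed as a percentage."""
    return ("Based on similarity to former and/or current employees, this "
            "applicant's probability of being eligible for rehire upon "
            f"termination is {round(probability * 100)} percent.")

def rank_statement(fraction_name):
    """Ranked criterion: position such as 'third', 'quarter', or 'fifth'."""
    return ("Based on similarity to former sales employees, this applicant is "
            f"likely to produce annual sales in the top {fraction_name} of all "
            "sales employees.")

# Hypothetical model outputs used only to exercise the templates.
tenure_text = tenure_statement(73.4, 60.8)
rehire_text = rehire_statement(0.75)
rank_text = rank_statement("quarter")
```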

Such text-based reporting methods as described above can be summarized,
illustrated with, appended to, or replaced by graphic images representing the behavioral
information. For example, charts, graphs, images, animated images, and other content
formats can be used.

Applicant processing system model insertion can be accomplished by
embedding a coded model in the application processing conducted by a transaction
monitor after the format of the predictive output has been determined. Data handling
routines can separate model input variables from the incoming data stream. The inputs
can be passed to the predictive model and be processed. The output of the model can
then be inserted or transformed into a reporting format as described above and added to
a hiring report transmission.
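
The data handling flow above (separate the inputs, run the model, format the output,
add it to the report) can be sketched as follows. Everything named here is
hypothetical: `process_application`, the field names, and especially `toy_model`, which
is a fixed weighted sum standing in for a trained network.

```python
def process_application(record, input_fields, model, render):
    """Separate model inputs from an incoming record and attach the prediction."""
    inputs = [record[name] for name in input_fields]  # data handling step
    prediction = model(inputs)                        # embedded coded model
    report = dict(record)                             # leave the input record untouched
    report["prediction_text"] = render(prediction)    # reporting format
    return report

def toy_model(inputs):
    """Stand-in for a coded model; NOT a trained network."""
    return 30.0 + 2.0 * inputs[0] + 10.0 * inputs[1]

record = {"name": "Applicant 1", "months_last_job": 12, "prefers_alone": 1}
report = process_application(
    record,
    ["months_last_job", "prefers_alone"],
    toy_model,
    lambda days: f"Estimated length of service: {days:.0f} days",
)
```

Passing the model and the renderer as arguments keeps the handling routine unchanged
when a newly trained model or a different report format is swapped in.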


Example 26 - Validation Archives 
As a new model is deployed, the replaced model can be transferred to an archive
storage. The archive can also record applicants processed by the old model. Such an
archive can be useful if reconstruction of results for a decommissioned model is desired
for administrative or other reasons.

Example 27 - Exemplary Item Rotation Technique 
An item rotation module can be implemented as a software program and
database of predictor item content. The item rotation module can be used to
systematically change pre-hire content so that useful predictor variables are retained
while non-useful (e.g., ineffective) predictors can be replaced with potentially useful
new predictors.

Adaptive learning includes the ability of a system to improve accuracy of its
behavioral predictions with successive validation cycles. Iterative neural network and
neuro-fuzzy model development and performance-driven item rotation can be used to
facilitate adaptive learning.

As part of a validation analysis for a model, predictor variables (e.g., pre-hire
questions or items) predictive of a criterion measure can be identified. At the same
time, other predictors with little or no modeling utility (e.g., ineffective predictors) can
be identified.

Performance-driven item rotation includes the practice of systematically
retaining and deleting pre-hire content so that item content with predictive utility
continues to serve as input for behavioral prediction with the current predictive model
and items with little or no predictive utility are dropped from the content. New,
experimental item content can be inserted into the content and response patterns can be
recorded for analysis in the next validation cycle.

Such rotation is shown in Tables 1 and 2. 

Table 1 - Item Content During Validation Cycle #1

Item                                                   Status
-----------------------------------------------------  -----------
You help people a lot                                  Ineffective
You tease people until they get mad                    Ineffective
You have confidence in yourself                        Effective
You would rather not get involved in others' problems  Ineffective
Common sense is one of your greatest strengths         Ineffective
You prefer to do things alone                          Effective
You have no fear of meeting people                     Effective
You are always cheerful                                Ineffective
24x7 = ?                                               Ineffective
You get mad at yourself when you make mistakes         Ineffective
How many months were you at your last job?             Effective


Table 2 - Item Content After Validation Cycle #1

Item                                                   Status
-----------------------------------------------------  ---------------------
Many people cannot be trusted                          New experimental item
You are not afraid to tell someone off                 New experimental item
You have confidence in yourself                        Effective - retained
You try to sense what others are thinking and feeling  New experimental item
You attract attention to yourself                      New experimental item
You prefer to do things alone                          Effective - retained
You have no fear of meeting people                     Effective - retained
You can wait patiently for a long time                 New experimental item
You say whatever is on your mind                       New experimental item
Background check item                                  New experimental item
How many months were you at your last job?             Effective - retained



The content shown in Table 1 has been refined to be that shown in Table 2,
based on the effectiveness of the predictor items. New experimental items have been
added, the effectiveness of which can be evaluated during subsequent cycles.

As successive validation cycles are completed and non-predictive item content is
systematically replaced with predictive item content, overall validity improves. After
multiple validation cycles, the result can be a shorter pre-hire questionnaire composed
of currently-performing predictive input and a few experimental items being validated
in an on-going process for system evolution toward higher predictive accuracy.
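
One rotation step of the kind shown in Tables 1 and 2 can be sketched as below. This
is a simplified illustration: `rotate_items` is a hypothetical name, the set of
effective items is assumed to come from a prior validation analysis, and the sketch
keeps the questionnaire the same length, whereas a real rotation could also shorten it.

```python
def rotate_items(items, effective, candidates):
    """Retain effective items and fill the freed slots with experimental items."""
    retained = [(text, "Effective - retained") for text in items if text in effective]
    open_slots = len(items) - len(retained)
    added = [(text, "New experimental item") for text in candidates[:open_slots]]
    return retained + added

# A four-item excerpt of Table 1 and a few of the Table 2 experimental items.
cycle_1 = ["You help people a lot", "You have confidence in yourself",
           "You are always cheerful", "You prefer to do things alone"]
effective = {"You have confidence in yourself", "You prefer to do things alone"}
candidates = ["Many people cannot be trusted", "You say whatever is on your mind",
              "You can wait patiently for a long time"]
cycle_2 = rotate_items(cycle_1, effective, candidates)
```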


Example 28 - Imaging and Delivery Subsystems 

Imaging and delivery subsystems can assemble input from applicant processing
to create an electronic image that resembles a traditional employment application that
can be transmitted to an employer's hiring site via external data devices (e.g., fax
machine, computer with email or web access, hand-held devices, digitally enabled
telephones, printers, or other text/graphics imaging devices). Hiring reports can also be
delivered as hard copy via mail or other delivery services.

Example 29 - Hire Site Report Reception 

Hiring managers can receive an electronic report that can be printed or simply
saved in electronic format. The entire application process can occur in real-time or
batch mode (e.g., overnight bulk processing). Real-time processing can result in hiring
report reception minutes after pre-hire data is uploaded. Such rapid report reception can
be an advantage of the system.

Example 30 - Exemplary Combination of Elements

The various above-described elements can be combined in various combinations
and sub-combinations to construct a system. For example, FIG. 13 shows an exemplary
combination of elements.

Pre-hire and post-hire data collection elements 1312 can send, via the incoming
communications subsystem 1316, information to the transaction monitor 1318. The
information can be stored in the applicant database 1322 while processed and then
stored in the reports database 1324. The reports database 1324 can be used to produce
corporate reports 1328.

A sample size monitor 1332 can monitor the reports database 1324 and send
information, via the validation queue 1338, to the predictive model development
environment 1342. Models from the development environment 1342 can be sent for
model deployment 1348, including hiring report modification and model insertion.

Archived models can be sent to the validation archives 1352, and an item
rotation module 1358 can track rotation of predictive content. Imaging and delivery
subsystems 1372 can deliver hire site reports 1378.

External service providers 1388 can interface with the system 1302 to provide a
variety of data such as applicant pre-hire information (e.g., background verification,
credit check information, social security number verification, traffic and criminal
information, and the like).

Fewer or additional elements can be included in a system.



Example 31 - Exemplary Process Overview 

The various techniques described above can be used in a process over time. In
such a process, adaptive learning can improve employee selection with successive
validation cycles as sample size increases and predictor input systematically evolves to
capture more criterion relationships and higher predictor-criterion fidelity. An example
is shown in FIGS. 14A-14D.

FIG. 14A shows a first cycle 1402. For example, when an employer first begins
to use a system, applicants enter pre-hire application and assessment responses using
external data collection devices. The data can be stored and processed as described
above, except that as of yet no behavioral predictions appear on the hiring report
because a sufficient number of employee histories has not yet been captured by the
system.

As employee job performance measures are taken, as employees leave and
complete exit interviews and their managers complete an exit evaluation, or as payroll
information is collected (also using the external data collection devices), employee
histories are added to the database. The rate of data accumulation is a function of how
quickly people apply, are hired, and then terminate employment. An alternative to
capturing post-hire job performance data upon termination is to collect similar data on
the same population prior to termination on a concurrent basis. In the example, the size
of the validation database is small, there is no adaptive learning, there are no predictive
models, and there are no behavioral predictions.

When a sufficient sample of employee histories is available, validation and
predictive modeling can occur. Following model development, the second validation
cycle 1422 can begin as shown in FIG. 14B. Ineffective pre-hire variables are dropped
or replaced with new content and the pre-hire application is modified. Applicant and
terminating employee processing continues and more employee histories are added to
the database. In the example, the validation database is medium, there is at least one
predictive model, and there is at least one behavioral prediction (e.g., length of service
or tenure).

A third validation cycle 1442 is shown in FIG. 14C. Initially, predictive
modeling might be limited to behavioral criteria commonly observed, such as length of
service, rehire eligibility, or job performance ratings because sample sufficiency occurs
first with such common measures. Other less frequently occurring data points (e.g.,
misconduct terminations) typically accumulate more slowly. As managers begin using
the behavioral predictions to select new employees, the composition of the workforce
can begin to change (e.g., newer employees demonstrate longer tenure, higher
performance, and the like).

As usable samples are obtained for different criteria (e.g., post-hire outcomes),
new models are developed to predict these behaviors. Older predictive models can be
replaced or re-trained to incorporate both new item content from the item rotation
procedure and additional criterion variation resulting from the expanding number of
employee histories contained in the validation database. In the example, the validation
database is large, there are differentiated models, and there are a number of behavioral
predictions (e.g., tenure, early quit, and eligibility for rehire).

Fourth and subsequent validation cycles 1462 are shown in FIG. 14D. Multiple
iterations of the validation cycle using larger and larger validation samples result in
multiple complex models trained to produce successively-improving behavioral
prediction across the spectrum of measurable job-related outcomes (e.g., eligibility for
rehire, tenure, probable job performance, probability of early quit, job fit, misconduct,
and the like). In the example, the validation database is very large, there are complex,
differentiated models, and many behavioral predictions.

The behavioral predictions can become more accurate the longer the system is in
place. If the system is used consistently over time, the workforce may eventually be
composed entirely of employees selected on the basis of their similarity to successful
former employees. Continued use of the adaptive learning employee selection technology
can be expected to produce positive changes in the global metrics used to assess
workforce effectiveness. Such metrics include lower rates of employee delinquency
(e.g., theft, negligence, absenteeism, job abandonment, and the like), higher rates of
productivity (e.g., sales, unit production, service delivery, and the like), longer average
tenure and reduced employee turnover, and higher workforce job satisfaction and more
effective employee placement.

Example 32 - Exemplary Process Overview

FIG. 15 is a process flow diagram illustrating an exemplary process 1502 for an
employment suitability prediction system. At 1512, data is collected. Such collection
can be accomplished in a wide variety of ways. For example, electronic data collection
units can be distributed, or a URL can be used by employment applicants.

Electronic versions of a standard employment application or tests can be
deployed. Also, post-hire data collection can be accomplished by deploying post-hire
data collection questionnaires and via payroll data transfer. Also, manager feedback
report apparatus (e.g., fax back reports or e-mail report of results) can be deployed so
managers can receive information such as hiring recommendations. The service can
then be implemented, and data collection can begin.

At 1522, feature selection can take place. Pre-hire application records can be
extracted from an applicant processing system, and post-hire outcome data can be
extracted from a reports database. Pre- and post- data can be sorted and matched from
both sources to create a matched predictor-criterion set. Information theoretic feature
selection can be run to identify top-ranking predictive items based on information
transmission (e.g., mutual information). Item data characterized by marginal mutual
information can be deleted and a distilled predictive modeling dataset can be saved.
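
The sort-and-match step that builds the matched predictor-criterion set can be
sketched as a join on a shared identifier. The identifier, the field names, and the
function name `match_records` are assumptions for illustration; the document does not
specify how records from the two sources are keyed.

```python
def match_records(pre_hire, post_hire):
    """Pair pre-hire responses with post-hire outcomes for the same person."""
    shared_ids = sorted(pre_hire.keys() & post_hire.keys())  # sort, then match
    return [(pre_hire[i], post_hire[i]) for i in shared_ids]

# Hypothetical extracts keyed by an applicant identifier.
pre_hire = {101: {"q1": 3, "q2": 1}, 102: {"q1": 1, "q2": 2}, 103: {"q1": 2, "q2": 2}}
post_hire = {101: {"tenure_days": 210}, 103: {"tenure_days": 45}}
matched = match_records(pre_hire, post_hire)
```

Applicant 102 has no post-hire outcome yet, so only two predictor-criterion pairs
enter the matched set.
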
At 1532, model development can take place. The distilled predictive modeling
dataset can be randomized and partitioned into training, testing, and verification subsets.
A group of models (e.g., neural networks) that meet performance criteria thresholds can
be built by experimenting with multiple neural network paradigms, architectures, and
model parameters.

The models can be tested for their ability to generalize (e.g., apply learned
pattern information from training and test sets to the verification dataset).
Non-generalizing models can be discarded and the surviving models can be saved.
Surviving models can be tested for differential prediction, adverse impact and
other anomalies. Biased nets can be discarded. Unbiased models can be ranked and
saved.

At 1542, model deployment can take place. The top-performing surviving
model can be converted to software command code. The code can be integrated into a
custom session processing task which executes model processing and exports the output
to an imaging program and hiring report generator.

The new session processing task can be tested for appropriate handling and
processing of the incoming data stream values in a software test environment. The
session processing task code can be refined and debugged if necessary. Then, the new
task can be deployed in an operational applicant processing system.

At 1552, performance tuning can take place. Data collection can continue.
Sample size can be monitored as incoming data accumulates. When an update threshold
is reached, new cases can be added to the matched predictor-criterion set by repeating
feature selection 1522. Item content can be revised using a performance-driven item
rotation procedure (e.g., replace or remove survey items with marginal information
transmission). Model development 1532, model deployment 1542, and performance
tuning 1552 can then be repeated.
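
The sample-size trigger in performance tuning can be sketched as below. The rule used
here (re-run the cycle once a fixed number of new matched cases has accumulated since
the last update) is an assumption, as are the function names and the threshold of
1000 cases; the document only says that an update threshold is reached.

```python
def needs_update(total_cases, cases_at_last_update, update_threshold):
    """True when enough new matched cases have accumulated to re-run the cycle."""
    return total_cases - cases_at_last_update >= update_threshold

def cases_at_each_update(case_counts, update_threshold):
    """Return the sample sizes at which a new validation cycle would start."""
    updates = []
    last = 0
    for count in case_counts:
        if needs_update(count, last, update_threshold):
            updates.append(count)
            last = count
    return updates

# Hypothetical running totals of matched cases observed by a sample size monitor.
updates = cases_at_each_update([300, 900, 1500, 2084, 2500, 3200], 1000)
```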

Example 33 - Effectiveness of a Model

Real-time electronic collection of data and sample size-driven refinement of
models can result in high model effectiveness. For example, FIG. 16 shows a graph 16
in which effectiveness 1622 of a reference system is shown. As conditions change over
time, the effectiveness 1622 of the system decreases. The mean effectiveness 1624 is
also shown.

A system employing real-time electronic data collection and sample size-driven
model refinement can exhibit the effectiveness 1632 as shown. As the model is refined,
the effectiveness of the model increases over time. Thus, the mean effectiveness 1634
is greater, resulting in a more effective system.

Example 34 - Exemplary Automated Hiring Recommendation Service

Using various of the technologies, a method for providing an automated hiring
recommendation service for an employer can be provided. Electronic devices can be
stationed at employer sites (e.g., retail outlets). The electronic devices can directly
accept pre-hire information from job applicants (e.g., answers to questions from a job
application). The pre-hire information can then be sent to a remote site (e.g., via a
network or telephone connection) for analysis. An artificial intelligence-based
predictive model or other model can be applied to the pre-hire information to generate
an automated hiring recommendation, which can be automatically sent to the employer
(e.g., via email).

Example 35 - Exemplary Implementation 
A behavioral prediction model can be developed to generate an estimate of the
tenure (length of service in days) to be expected of applicants for employment as
customer service representatives of a national chain of video rental stores. Such
predictions can be based on the characteristics and behaviors of past employees in the
same job at the same company. Application of the model can result in higher average
tenure and lower employee turnover.

As a specific example, pre-hire application data used to develop this exemplary
model was collected over a period of a year and a half using an electronic employment
application as administered using screen phones deployed in over 1800 stores across the
United States. Termination records of employees hired via the system were received by
download. Over 36,000 employment applications were received in the reporting period,
of which approximately 6,000 resulted in employment. Complete hire to termination
records were available for 2084 of these employees, and these records were used to
develop the model.

When building the model, definition of system inputs and outputs was
accomplished. Independent or predictor variables can be measures of individual
characteristics thought to be related to a behavior or outcome resulting from a behavior.
In industrial psychology and employee selection, typical predictor variables might be
measures of education, experience or performance on a job-related test. Criterion
variables can be measures of the behavior or outcome to be predicted and might include
sales effectiveness, job abandonment, job performance as measured by supervisor
ratings, employee delinquency and other behavioral metrics or categories.

In this example, predictor variables are inputs and criterion variables are outputs.
In this research, input variables consist of a subset of the employment application data
entered by applicants when applying for jobs (see Tables 4 and 5 for a listing of the
variables used in this model). The output or criterion is the number of days that an
employee stayed on the payroll.

The process of identifying the subset of predictor variables to be used in a model
is sometimes called "feature selection." While any information gathered during the
employment application process may have predictive value, the set of predictors is
desirably reduced as much as possible. The complexity (as measured by the number of
network connections) of a network can increase geometrically with the number of
inputs. As complexity increases so can training time along with the network's
susceptibility to over-training. Therefore, inputs with less predictive power can be
eliminated in favor of a less complex neural network model.

For the tenure prediction model in this illustrative example, information
theoretic methods were employed to determine the subset of input variables that
maximized information transmission between the predictor set and the criterion. Such
an approach can rely on the statistical theory of independent events, where events
$p_1, p_2, \ldots, p_n$ are considered statistically independent if and only if the
probability $P$ that they occur on a given trial is

$$P = \prod_{i=1}^{n} p_i \qquad (1)$$

Conversely, the measurement of how much a joint distribution of probabilities differs
from the independence distribution can be used as a measure of the statistical
dependence of the random events.

Information theoretic entropy can provide a convenient metric for estimating the
difference between distributions. The entropy, $H(X)$ (measured in bits), of the
distribution of a discrete random variable $X$ with $n$ states can be

$$H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (2)$$

where $p_i$ is the probability of state $i$. Entropy can be maximized when a distribution is
uniform. For example, FIG. 17 shows a graph 1702 of the entropies 1722 of
single-variable, discrete 2-state distributions and how their probabilities vary.
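
Equation (2) can be checked numerically for a few 2-state distributions. The sketch
below is a direct transcription of the formula; the function name `entropy` and the
convention of skipping zero-probability states (treating 0 log 0 as 0) are choices
made for this illustration.

```python
import math

def entropy(probabilities):
    """Entropy in bits: H(X) = -sum(p_i * log2(p_i)), skipping zero-probability states."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

uniform = entropy([0.5, 0.5])   # uniform 2-state distribution, the maximum case
skewed = entropy([0.9, 0.1])    # unequal probabilities give lower entropy
certain = entropy([1.0, 0.0])   # a single certain state carries no uncertainty
```

The uniform case yields exactly one bit, and entropy falls toward zero as one state
becomes certain, which is the shape described for the 2-state distributions.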

Similarly, for a multivariate distribution constrained by specified marginal
distributions, the distribution that maximizes entropy can be the independence
distribution. Therefore, given a joint distribution with fixed marginals, the distribution
that minimizes entropy can be the distribution for which the variables are completely
dependent. Dependence can be viewed as constraint between variables and as constraint
is reduced, entropy increases. Information theoretic analysis of a distribution is then the
measurement of constraint. Decreasing entropy can indicate dependence (minimal
entropy, maximum constraint), and increasing entropy can indicate independence
(maximum entropy, minimum constraint). Assuming some constraint between
variables, a sampled distribution can lie somewhere between complete dependence and
independence and have a measurable entropy.

If we are analyzing the joint distribution of the variables $X$ and $Y$, the entropy
for this sampled distribution can be $H(XY)$. The entropies of the variables $X$ and
$Y$ measured separately are $H(X)$ and $H(Y)$ and can be computed using the marginals
of the joint distribution.

Since $H(X)$ and $H(Y)$ are calculated from the marginals and entropy can be
logarithmic,

$$H(X) + H(Y) = H(XY) \qquad (3)$$

if there is no constraint between $X$ and $Y$.

Or:

$$H(XY) = H(X) + H(Y) \qquad (4)$$

if and only if $X$ and $Y$ are independent.

This equality can indicate that there is no relationship between $X$ and $Y$ and the
joint distribution of the variables is the independence distribution.

Information transmission $T$ can be the measure of the distance between
distributions along the continuum described above. For discrete random variables $X$
and $Y$, $T(X:Y)$, the information transmission between $X$ and $Y$, is computed:

$$T(X:Y) = H(X) + H(Y) - H(XY) \qquad (5)$$

$T(X:Y)$ is the difference between the entropies of the independence distribution and
the sampled joint distribution. The degree of dependence between $X$ and $Y$ can
therefore be computed by measuring information transmission. A small value for
$T(X:Y)$ indicates the variables $X$ and $Y$ are nearly independent, whereas a large
value suggests a high degree of interaction.
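
Equation (5) can be estimated from a sample of paired observations by counting. The
sketch below is an illustration under stated assumptions: probabilities are estimated
as simple relative frequencies, and the names `entropy_from_counts` and `transmission`
are hypothetical.

```python
import math
from collections import Counter

def entropy_from_counts(counts):
    """Entropy in bits of the distribution implied by a collection of counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def transmission(pairs):
    """T(X:Y) = H(X) + H(Y) - H(XY) for a list of (x, y) observations."""
    joint = Counter(pairs)
    x_marginal = Counter(x for x, _ in pairs)
    y_marginal = Counter(y for _, y in pairs)
    return (entropy_from_counts(x_marginal.values())
            + entropy_from_counts(y_marginal.values())
            - entropy_from_counts(joint.values()))

# Completely dependent sample: Y always equals X.
dependent = [(0, 0), (1, 1)] * 50
# Independent sample: every combination occurs equally often.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

t_dependent = transmission(dependent)
t_independent = transmission(independent)
```

The dependent sample transmits one full bit while the independent sample transmits
none, matching the two ends of the continuum described above.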

In a directed system, such as a predictive model, the measure of information
transmission between the distribution of an independent variable $X$ and a dependent
variable $Y$ can be used to gauge the predictive value of $X$. The goal can be to find a
subset $S$ of the independent variables $V$ such that, for the set of dependent variables
$D$:

$$T(D:V) \approx T(D:S) \qquad (6)$$

However, as discussed, the modeling technique to be employed may limit the cardinality
of $S$ so the filtering process can be guided by the following considerations:

1. if $S'$ is any subset of $V$ smaller than $S$, then $T(D:S')$ is significantly
smaller than $T(D:S)$.

2. if $S'$ is any subset of $V$ larger than $S$, then $T(D:S')$ is not significantly
larger than $T(D:S)$.

Since information theoretic transmission can measure the degree of difference between
distributions of variables, without regard to the nature of the difference, the technique
can be considered "model free". This property allows the methodology to work as an
effective filter regardless of the subsequent modeling techniques employed.

When this type of feature selection was applied to tenure prediction, 56
questions (see Tables 4 and 5) were selected as having the most predictive value with
respect to applicant tenure.

Once the set of predictor variables or inputs has been defined and the output
criterion variable specified, a neural network model can be trained. For the tenure
prediction model, 2084 cases were available. This sample was divided into training, test
and verification sets. The training set contained 1784 cases and the verification and test
sets contained 150 cases each.

The best performing neural network architecture was found to be a single hidden
layer feed-forward network with 56 input nodes and 40 hidden layer nodes.

The network was developed with the STATISTICA Neural Network package
using a combination of quick-propagation and conjugate gradient training.

The performance on the training and verification sets began to diverge
significantly after 300 epochs. This was deemed to be the point of over-training.
Optimal performance on the hold-out sets was achieved at 100 epochs. The results are
shown in Table 3, which contains final distribution statistics of model output for each of
the three data subsets. Unadjusted correlation and significance statistics are in relation
to actual tenure. By any standard, an employee selection procedure with a correlation in
the .5 range with a job-related criterion is not merely acceptable, but exceptional. Many
validated selection procedures in use today were implemented on the basis of validity
coefficients in the range of .2 to .3.

Table 3 - Summary Statistics of Model Output

              Train      Verify     Test
------------  ---------  ---------  ---------
Data Mean     73.42657   82.89333   71.03333
Data S.D.     70.92945   71.22581   62.16501
Error Mean    -0.4771    -7.2582    7.440303
Error S.D.    60.84374   60.93211   53.80157
Correlation   0.514349   0.51901    0.503975
Significance  0.000      0.000      0.000



Based on the correlation between prediction and the hold-out sets, the expected
correlation between predictive model output and actual tenure for future applicants
should be in the range of 0.5.

As described in the example, information theoretic feature selection was used to
identify fifty-six biodata and personality assessment item responses that were related to
employee tenure in a sample of over two thousand employees at a national video rental
chain. The data was collected via interactive electronic survey administration on a
network of screen phones deployed in many regions of the U.S.

A fully-connected, feed-forward backpropagation neural network was trained to
produce an estimate of tenure in days using these fifty-six predictor variables (e.g.,
answers to the questions) as inputs. Network architecture consisted of 56 input neurons
or nodes, a hidden layer of forty nodes and one output node. Conjugate gradient descent
training resulted in convergence between training and test set minimum error in about
300 iterative training exposures to the data. Model performance on an independent
hold-out sample obtained a statistically significant correlation of .5 with actual tenure.
These results are well within the range of acceptable performance for a
criterion-referenced employee selection procedure and represent a significant
improvement over many systems.
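
To give a sense of the model size implied by this architecture, the connections of a
fully connected 56-40-1 network can be counted. This is a small worked sketch; the
function name `connection_count` and the optional bias handling are illustrative
assumptions, and the document does not state whether bias terms were used.

```python
def connection_count(layer_sizes, include_bias=False):
    """Count weights between consecutive fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out      # one weight per input-output node pair
        if include_bias:
            total += fan_out           # one bias per receiving node (assumed)
    return total

weights_only = connection_count([56, 40, 1])                    # 56*40 + 40*1
with_bias = connection_count([56, 40, 1], include_bias=True)    # adds 40 + 1 biases
```

Each additional input would add 40 more weights for the same hidden layer, which is
one reason inputs with little predictive value are worth filtering out before training.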

In the example, based on information theoretic analysis, the responses to the
questions shown in Tables 4 and 5 were deemed to be the most predictive. The
following descriptions are the questions in their entirety accompanied by the possible
responses.

To determine that these questions were the most predictive, information
theoretic analysis of the joint distribution of the response (alone or together with other
responses) and the dependent variable, tenure, was performed. The nature of the
relationship between a specific response and the criterion variables may not be known;
however, the predictive success of the neural model suggests this relationship has, to
some degree, been encoded in the weight matrix of the neural network.



Table 4 - Pre-hire Content Examples

1. How long do you plan to stay with this job if hired?
   1 - Less than 6 months
   3 - More than 1 year

2. Have you ever worked for this employer before?
   1 - Yes
   2 - No

3. Reason for leaving? (if previously employed by this employer)

4. Which type of position do you desire?
   1 - Store Director
   2 - Assistant Director
   3 - Customer Service Representative
   4 - Shift Leader
   5 - Let's Discuss

5. What do you expect to earn on an hourly basis?
   (hourly wage given)

6. Desired Schedule?
   1 - Regular (not seasonal)
   2 - Seasonal

7. Desired Hours?
   1 - Full time
   2 - Part time

8. When would you be available to start?
   1 - Right Away (within the next day)
   2 - Specific Date (if not available to start within the next day)

9. Highest Education Level?
   1 - 2 Years of College or Less:
       1 - Not indicated
       2 - Less than HS Graduate
       3 - HS graduate or equivalent
       4 - Some college
       5 - Technical School
       6 - 2-year college degree
   2 - More than 2 years of college
       1 - Bachelor's level degree
       2 - Some graduate school
       3 - Masters level degree
       4 - Doctorate (academic)
       5 - Doctorate (professional)
       6 - Post-doctorate
       7 - Degree not completed
       8 - 2-year college degree

10. What was your reason for leaving? (last job)
   1 - Voluntarily quit
   2 - Involuntarily terminated
   3 - Laid off
   4 - Still there

11. What was/is your job title? (last job)
   1 - Cashier
   2 - Stock person
   3 - Customer Service Representative
   4 - Management
   5 - Other

12. Please describe the area you worked in. (last job)
   1 - Apparel
   2 - Inventory
   3 - Customer service
   4 - Food service
   5 - Operations
   6 - Computers/Electronics
   7 - Merchandising
   8 - Personnel
   9 - Other

13. What was/is your supervisor's last name?
   (given or not given)

14. May we contact this employer?
   1 - Yes
   2 - No

15. What was your reason for leaving? (prior job)
   1 - Voluntarily quit
   2 - Involuntarily terminated
   3 - Laid off
   4 - Still there

16. What was/is your job title? (prior job)
   1 - Cashier
   2 - Stock person
   3 - Customer Service Representative
   4 - Management
   5 - Other

17. Please describe the area you worked in. (prior job)
   1 - Apparel
   2 - Inventory
   3 - Customer service
   4 - Food service
   5 - Operations
   6 - Computers/Electronics
   7 - Merchandising
   8 - Personnel
   9 - Other

18. What was/is your supervisor's last name? (prior job)
   (given or not given)

19. May we contact this employer? (prior job)
   1 - Yes
   2 - No

20. What was your reason for leaving? (prior to prior job)
   1 - Voluntarily quit
   2 - Involuntarily terminated
   3 - Laid off
   4 - Still there

21. What was/is your job title? (prior to prior job)
   1 - Cashier
   2 - Stock person
   3 - Customer Service Representative
   4 - Management
   5 - Other

22. Please describe the area you worked in. (prior to prior job)
   1 - Apparel
   2 - Inventory
   3 - Customer service
   4 - Food service
   5 - Operations
   6 - Computers/Electronics
   7 - Merchandising
   8 - Personnel
   9 - Other

23. What was/is your supervisor's last name? (prior to prior job)
   (given or not given)

24. May we contact this employer? (prior to prior job)
   1 - Yes
   2 - No

25. Academic Recognitions?
   (listed or not listed)

26. Other Recognitions?
   (listed or not listed)

27. Have you previously applied for employment at this employer?
   1 - Yes
   2 - No

28. Referral Source
   1 - Referred to this employer by Individual or Company
       1 - Agency
       2 - Client Referral
       3 - College Recruiting
       4 - Employee Referral
       5 - Former Employee
       6 - Executive Referral
       7 - Executive Search
   2 - Other Source of Referral
       1 - Advertisement
       2 - Job Fair
       3 - Job Posting
       4 - Open House
       5 - Other Source
       6 - Phone Inquiry
       7 - Unknown
       8 - Unsolicited
       9 - Walk In

29. Last name of referral
   (listed or not listed)

30. Any other commitments?
   (listed or not listed)

31. Any personal commitments?
   (listed or not listed)



The possible responses to the questions of Table 5 are as follows: "1 - It is
definitely false or I strongly disagree, 2 - It is false or I disagree, 3 - It is true or I agree,
4 - It is definitely true or I strongly agree."

Table 5 - Pre-hire Content Examples (e.g., Hourly Workers)

1. You have confidence in yourself.
2. You are always cheerful.
3. You get mad at yourself when you make mistakes.
4. You would rather work on a team than by yourself.
5. You try to sense what others are thinking and feeling.
6. You can wait patiently for a long time.
7. When someone treats you badly, you ignore it.
8. It is easy for you to feel what others are feeling.
9. You keep calm when under stress.
10. You like to be alone.
11. You like to talk a lot.
12. You don't care what people think of you.
13. You love to listen to people talk about themselves.
14. You always try not to hurt people's feelings.
15. There are some people you really can't stand.
16. People who talk all the time are annoying.
17. You are unsure of yourself with new people.
18. Slow people make you impatient.
19. Other people's feelings are their own business.
20. You change from feeling happy to sad without any reason.
21. You criticize people when they deserve it.
22. You ignore people you don't like.
23. You have no big worries.
24. When people make mistakes, you correct them.
25. You could not deal with difficult people all day.



Example 36 - Exemplary Implementation Using
Information-Theoretic Feature Selection

Information-theoretic feature selection can be used to choose appropriate inputs
for a model. In the following example, the source for the data used to develop the
model was a large national video rental company. The sample contains over 2000 cases,
with 160 responses to application questions collected prior to hiring and tenure (in days)
for former employees. The model was constructed to predict the length of employment
for a given applicant, if hired.

The application itself consists of 77 bio-data questions (e.g., general, work-related
information, job history, education and referrals questions) and 83 psychometric
questions. The psychometric assessment portion was designed to predict the reliability
of an applicant in an hourly, customer service position. For the purposes of model
development, each question response was treated as a single feature and the reliability
score was not provided to the neural network or feature selection process.

While any information gathered during the application process may have predictive
value, the set of input variables (independent variables or "IVs") can be reduced.
Possible justifications are as follows:

1. Not all potential IVs may have significant predictive value. The use of
variables with little or no predictive value as inputs can add noise. Adding IVs to the
model which cannot improve predictive capability may degrade prediction since the
network may need to adapt to filter these inputs. This can result in additional training
time and neural resources.

2. Predictive models can provide a mapping from an input space to an output
space. The dimensionality of this input space increases with the number of inputs.
Thus, there are more parameters required to cover the mapping, which in turn
increases the variance of the model (in terms of the bias/variance dilemma); such a
problem is sometimes referred to as the "curse of dimensionality."

IVs with less predictive power can be eliminated in favor of a less complex
neural network model by applying feature selection. Such methods fall into two general
categories: filters and wrappers, either of which can be used.

1. Wrappers can use the relationship between model performance and IVs
directly by iteratively experimenting with IV subsets. Since the nature of the bias of
the feature selection method matches that of the modeling technique, this approach
can be theoretically optimal if the search is exhaustive.

The exhaustive application of wrappers can be computationally overwhelming
for most modeling problems since the number of possible subsets is

$$\frac{n!}{k!\,(n-k)!}$$

where n is the total number of IVs and k is the cardinality of the subset of features.

Additionally, there can be non-determinism within the modeling process. In
neural modeling, though training algorithms are typically deterministic, random
initialization of the weight parameters varies the results of models developed with
the same inputs. Therefore, even exhaustive trials may not prove conclusive with
respect to estimating the predictive value of a set of features.

2. Filters can analyze the relationship between sets of IVs and dependent
variables (DVs) using methods independent of those used to develop the model.
The bias of the filter may be incompatible with that of the modeling technique.

For example, a filter may fail to detect certain classes of constraint, which the
subsequent modeling stage may utilize. Conversely, the filter may identify relations
which cannot be successfully modeled. Ideally, a filter can be completely inclusive in
that no constraint which might be replicated by the subsequent modeling stage would be
discarded.
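
To make the cost of an exhaustive wrapper search concrete, the sketch below counts the subsets of a given cardinality for the 160 application responses in this example. The function and variable names are illustrative only.

```python
import math

def subset_count(n, k):
    """Number of distinct feature subsets of cardinality k drawn from n IVs."""
    return math.comb(n, k)

n_ivs = 160  # application responses available as candidate inputs in the example
counts = {k: subset_count(n_ivs, k) for k in (1, 2, 3, 4)}
for k, c in counts.items():
    print(f"subsets of size {k}: {c:,}")
```

Even at cardinality 4 there are more than 26 million candidate subsets, and each one would require training and evaluating a model under a wrapper approach.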

Information-theoretic feature selection can make use of the statistical theory of
independent events. Events p_1, p_2, ..., p_n are considered statistically independent if and
only if the probability P that they all occur on a given trial is

$$P = \prod_{i=1}^{n} P(p_i)$$

The degree to which a joint distribution of probabilities diverges from the
independence distribution may be used as a measure of the statistical dependence of the
events.



Information-theoretic entropy can provide a convenient metric for quantifying
the difference between distributions. The entropy, H(X) (measured in bits), of the
distribution of a discrete random variable, X, with n states can be

$$H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i$$

where p_i is the probability of state i.

Entropy can be maximized when a distribution is most uncertain. If a
distribution is discrete, this occurs when it is uniform. FIG. 17 shows a graph of the
entropies of a single variable, 2-state distribution as the state probabilities vary.
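
A minimal sketch of the entropy calculation, using the usual convention that zero-probability states contribute nothing. The three example distributions are made up to show the uniform case, a skewed case, and a certain case for a 2-state variable.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; states with zero probability are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = entropy([0.5, 0.5])   # most uncertain 2-state distribution
skewed = entropy([0.9, 0.1])    # less uncertain
certain = entropy([1.0, 0.0])   # no uncertainty at all
```

The uniform case gives the maximum of 1 bit for two states, and the value falls toward 0 as one state becomes certain.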

For a multivariate distribution constrained by fixed marginals, the distribution
which maximizes entropy can be the independence distribution (calculated as the
product of the marginals). The distribution which minimizes entropy can be the
distribution for which the variables are completely dependent.

Dependence can be constraint between variables, so as constraint is reduced,
entropy increases. Information-theoretic analysis can therefore be used to measure
constraint. For a joint distribution of discrete variables, X and Y, the total entropy,
H(XY), can be

$$H(XY) = -\sum_{i,j} p_{ij} \log_2 p_{ij} \qquad (10)$$

where p_ij is the probability of state i,j occurring in the joint distribution of X and Y,
where i designates the state of X and j is the state of Y. The entropies of X and Y are
computed with the marginals of the joint distribution

$$H(X) = -\sum_{i} \Big(\sum_{j} p_{ij}\Big) \log_2 \Big(\sum_{j} p_{ij}\Big) \qquad (11)$$

$$H(Y) = -\sum_{j} \Big(\sum_{i} p_{ij}\Big) \log_2 \Big(\sum_{i} p_{ij}\Big) \qquad (12)$$



Information transmission (or "mutual information") can be the measure of the distance
between the independence and observed distributions along the continuum discussed
above. For X and Y, T(X:Y) (the information transmission between X and Y), is
computed

$$T(X:Y) = H(X) + H(Y) - H(XY) \qquad (13)$$

In a directed system, the measure of information transmission between the distribution
of an independent variable X and a dependent variable Y is a gauge of the predictive
value of X. H(X) + H(Y) = H(XY) if and only if there is no constraint between X and
Y, in which case X would be a poor predictor for Y.
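
Expanding the three entropies in terms of the joint probabilities makes the "distance from independence" reading explicit. The marginal symbols $p_{i\cdot}$ and $p_{\cdot j}$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
T(X:Y) &= H(X) + H(Y) - H(XY) \\
       &= -\sum_{i} p_{i\cdot}\log_2 p_{i\cdot}
          - \sum_{j} p_{\cdot j}\log_2 p_{\cdot j}
          + \sum_{i,j} p_{ij}\log_2 p_{ij} \\
       &= \sum_{i,j} p_{ij}\log_2 \frac{p_{ij}}{p_{i\cdot}\,p_{\cdot j}},
\qquad p_{i\cdot} = \sum_{j} p_{ij},\quad p_{\cdot j} = \sum_{i} p_{ij}.
\end{aligned}
```

The last line is zero exactly when every joint probability equals the product of its marginals, which is the independence distribution, and it grows as the observed distribution moves away from it.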

In order for a computed transmission value, T, to be considered an accurate
measure of existing constraint, the statistical significance of T for some confidence
level, α, can be determined using the χ² test. The degrees of freedom (df) for a
transmission, T(X:Y), can be calculated

$$df_{T(X:Y)} = df_{XY} - df_{X} - df_{Y} \qquad (14)$$

As the size of the joint distribution increases, so does the df for the significance
of the transmission value. Since significance decreases as df increases, the data
requirements for transmissions containing a large number of variables can quickly
become overwhelming.
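
The growth in degrees of freedom can be sketched as follows, taking the df of a distribution to be its number of states minus one, so that the df of a transmission is df_XY - df_X - df_Y. Two values below are assumptions rather than statements from the text: four response options per psychometric item, and tenure grouped into ten states. They are used because they reproduce the df column of Tables 7 and 8.

```python
def transmission_df(iv_state_counts, dv_states):
    """df of T(X:Y) computed as df_XY - df_X - df_Y, where df = states - 1."""
    x_states = 1
    for n in iv_state_counts:
        x_states *= n                    # joint number of states of the IV set
    df_xy = x_states * dv_states - 1     # joint distribution of IVs and DV
    df_x = x_states - 1
    df_y = dv_states - 1
    return df_xy - df_x - df_y

# Assumed: 4 response options per item, tenure clustered into 10 states.
first_order = transmission_df([4], 10)
second_order = transmission_df([4, 4], 10)
third_order = transmission_df([4, 4, 4], 10)
```

Each added four-state item multiplies the number of joint states by four, so the df climbs from 27 to 135 to 567 while the number of cases stays fixed.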

A superior feature set can be determined. A goal can be to discover a subset S of
the independent variables V that has the same predictive power as the entire set with
respect to the dependent variables, D.

$$T(V:D) \approx T(S:D) \qquad (15)$$

The filtering process can therefore be guided by the following:

1. If S' is any subset of V smaller than S, then T(S':D) is significantly smaller than
T(S:D).

2. If S' is any subset of V larger than S, then T(S':D) is not significantly larger than
T(S:D).

Higher-order interactions are synergies between variables where the predictive
power of a set of variables is significantly higher than that of the sum of the individual
variables. In terms of information transmission for the IVs X_1, ..., X_n, and dependent
variable D, this is represented,

$$T(X_1:D) + \cdots + T(X_n:D) < T(X_1, \ldots, X_n:D) \qquad (16)$$

An illustration of this phenomenon among discrete binary variables A, B and C is
shown by the contingency tables in Tables 6A and 6B.

Table 6A - Contingency Table for Distribution ABC, C=0

          B=0     B=1
A=0       1/4     0
A=1       0       1/4

Table 6B - Contingency Table for Distribution ABC, C=1

          B=0     B=1
A=0       0       1/4
A=1       1/4     0



For the illustrated system, the following transmissions are computed:

T(A:C) = H(A) + H(C) - H(AC) = 0 bits
T(B:C) = H(B) + H(C) - H(BC) = 0 bits
T(AB:C) = H(AB) + H(C) - H(ABC) = 1 bit

Knowledge of A or B individually does not reduce the uncertainty of C, but
knowledge of A and B eliminates uncertainty since only one state of C is possible. With
only first order transmission values, A and B would not appear to be predictive
features, when in fact, together they are ideal.
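
The three transmissions for the distribution of Tables 6A and 6B can be checked numerically. The sketch below stores the joint distribution as a dictionary keyed by (A, B, C) states; the helper names are illustrative.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Entropy in bits of a distribution given as {state: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Sum the joint distribution over every variable not indexed in keep."""
    out = defaultdict(float)
    for state, p in joint.items():
        out[tuple(state[i] for i in keep)] += p
    return out

def transmission(joint, ivs, dvs):
    """T(X:Y) = H(X) + H(Y) - H(XY) for index tuples ivs (X) and dvs (Y)."""
    return (entropy(marginal(joint, ivs)) + entropy(marginal(joint, dvs))
            - entropy(marginal(joint, ivs + dvs)))

# States are (A, B, C): the four equally likely cells of Tables 6A and 6B.
joint = {(0, 0, 0): 0.25, (1, 1, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25}

t_a_c = transmission(joint, (0,), (2,))     # A alone against C
t_b_c = transmission(joint, (1,), (2,))     # B alone against C
t_ab_c = transmission(joint, (0, 1), (2,))  # A and B together against C
```

Each single variable carries no information about C, while the pair carries the full 1 bit, which is the pattern a search limited to first order transmissions would miss.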

Higher order interactions were observed in the video clerk tenure data. Table 7
lists the top ten single variable transmissions between the psychometric questions and
tenure. Table 8 shows the top five two-variable and three-variable transmissions. Each
of the most predictive sets of questions (based on transmission values) in both the
second and third order lists, T(q35 q73:tenure) and T(q4 q12 q39:tenure), contain only
one question from the top ten most predictive questions based on first order
transmissions.

Table 7 - Single Order Transmissions Between Psychometrics and Tenure

variables          trans.    %H(DV)    df     χ² sig.
T(q83:tenure)      0.0168    0.754     27     0.999
T(q3:tenure)       0.0140    0.628     27     0.991
T(q63:tenure)      0.0135    0.607     27     0.987
T(q65:tenure)      0.0133    0.598     27     0.985
T(q48:tenure)      0.0133    0.595     27     0.984
T(q44:tenure)      0.0132    0.593     27     0.984
T(q35:tenure)      0.0128    0.573     27     0.977
T(q21:tenure)      0.0127    0.569     27     0.975
T(q8:tenure)       0.0123    0.553     27     0.967
T(q69:tenure)      0.0123    0.552     27     0.966



Table 8 - Higher (second and third) Order Transmissions between Psychometrics
and Tenure

variables                  trans.    %H(DV)    df     χ² sig.
T(q35 q73:tenure)          0.0593    2.663     135    1.00
T(q21 q83:tenure)          0.0588    2.639     135    1.00
T(q39 q65:tenure)          0.0585    2.627     135    1.00
T(q61 q70:tenure)          0.0569    2.553     135    0.999
T(q44 q53:tenure)          0.0567    2.546     135    0.999
T(q4 q12 q39:tenure)       0.1808    8.112     567    0.921
T(q10 q39 q65:tenure)      0.1753    7.864     567    0.811
T(q4 q39 q44:tenure)       0.1720    7.718     567    0.712
T(q4 q39 q51:tenure)       0.1718    7.709     567    0.705
T(q52 q61 q70:tenure)      0.1717    7.702     567    0.700



Such interactions can complicate the search for the optimal set S since the
members of V may not appear as powerful predictors in calculated transmissions using
sets of features of cardinality less than |S| (the cardinality of the optimal subset S).

Due to issues of χ² significance, it is frequently overwhelming to calculate
significant transmission values for sets of variables of cardinality approaching |S|.
Additionally, since the number of subsets of a given cardinality soon becomes very large,
even if the significance issues were addressed, computational limitations would persist.

In feature selection algorithms that approximate an exhaustive search for S by
computing only pairwise transmissions, higher-order interaction effects are not detected.
Such methods may not accurately approximate S since only variables which are strong
single variable predictors will be selected.

Based on the following guidelines, heuristics were applied in an effort to address
the problems of combinatorics and significance in measuring higher-order relations.

Although it is possible for members of the optimal subset of IVs, S, to be
completely absent from all large lower order transmissions, this is unlikely. An
omission can be increasingly unlikely as the order of the transmissions calculated
approaches |S|. It is therefore likely that significant members of S will appear in the top
n transmissions of the highest order transmission computed, where n is sufficiently
large. Thus, as n increases, the union of the set of IVs appearing in the most predictive
transmissions will probably approach S.

With these guidelines, a process for generating an approximation to S (S'), given
the set V of significant IVs and the set D of all DVs, can be presented.

In the following process (1-6), T_k will be used to denote the set of transmissions
of order k (containing k IVs) from a set of n features.

1. Calculate the transmissions, T_k, for the highest order, k, for which the
transmissions may be calculated.

2. Choose the m unique transmissions of the greatest magnitude from T_k to be
the base set for higher-order transmissions.

3. Generate T'_{k+1} by adding the IV to members of T_k which generates the set T'_{k+1}
with the largest transmission values. Note that T'_{k+1} is a subset of T_{k+1} since it
contains only those members of T_{k+1} which can be generated from T_k by adding one
independent variable to each transmission.

4. Discard any duplicate transmissions.

5. Repeat Steps 3 and 4 until significance is exhausted.

6. Take the union of the variables appearing in as many of the most predictive
transmissions as is necessary to generate a set of size |S|. This union is the
approximation of the set S.

Since |S| is unknown, this value is estimated. However, 0 ≤ |S| ≤ |V|, so it is often feasible
to experiment with the S' for each cardinality.
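
The six steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation used in the example: transmissions are estimated from raw counts, step 5 is replaced by a fixed `max_order` instead of a χ² significance check, and the names `select_features`, `m`, and `size` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(counts, total):
    """Entropy in bits from a Counter of observed states."""
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def transmission(rows, ivs, dv):
    """Estimate T(ivs:dv) in bits from rows of discrete values."""
    n = len(rows)
    x = Counter(tuple(r[i] for i in ivs) for r in rows)
    y = Counter(r[dv] for r in rows)
    xy = Counter((tuple(r[i] for i in ivs), r[dv]) for r in rows)
    return entropy(x, n) + entropy(y, n) - entropy(xy, n)

def select_features(rows, candidates, dv, start_order, max_order, m, size):
    # Step 1: every transmission of the starting order.
    scored = {s: transmission(rows, s, dv)
              for s in combinations(candidates, start_order)}
    for _ in range(start_order, max_order):        # Step 5 (fixed depth, assumed)
        # Step 2: keep the m largest transmissions as the base set.
        base = sorted(scored, key=scored.get, reverse=True)[:m]
        # Steps 3-4: add one IV to each base set; dict keys drop duplicates.
        scored = {}
        for s in base:
            for iv in candidates:
                if iv not in s:
                    t = tuple(sorted(s + (iv,)))
                    scored[t] = transmission(rows, t, dv)
    # Step 6: union of IVs from the most predictive sets until `size` is reached.
    selected = []
    for s in sorted(scored, key=scored.get, reverse=True):
        for iv in s:
            if iv not in selected and len(selected) < size:
                selected.append(iv)
    return sorted(selected)

# Toy data: columns 0 and 1 jointly determine column 3; column 2 is noise.
rows = [(a, b, n, a ^ b) for a in (0, 1) for b in (0, 1) for n in (0, 1)] * 10
chosen = select_features(rows, (0, 1, 2), dv=3,
                         start_order=1, max_order=2, m=3, size=2)
```

On the toy data every first order transmission is zero, yet the second order pass ranks the pair (0, 1) first, so the approximation recovers the two interacting inputs and leaves out the noise column.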

An issue raised by feature selection processes is the effect of dependence
between members of S'. This dependence may be viewed as the redundancy in the
predictive content of the variables. One solution proposed is to calculate the pairwise
transmissions T(s'_i:s'_j) between features s'_i and s'_j from a candidate S'. Features
which exhibit high dependence (high pairwise transmissions) are penalized with respect
to the likelihood of their inclusion in the final S'.

Dependence between features is dealt with implicitly in the process above since
such dependence will reduce the entropy, thereby reducing the magnitude of the
transmission between a set of features and the set of dependent variables. Highly
redundant feature sets will have low transmission values relative to less redundant sets
of the same cardinality and will therefore be less likely to contribute to S'.

While tenure in days is a discrete measure, the number of possible states makes
it difficult to use the variable without transformation, since a large number of states
makes the joint distribution sparse (high df relative to the data population) and any
transmissions calculated statistically insignificant. Since tenure is an ordered variable,
applying a clustering algorithm was not problematic.

Clustering is a form of compression, so care can be taken to minimize
information loss. The clustering phase was guided by efforts to maximize the entropy
of the clustered variable within the confines of the needs of statistical significance.
Though transmission values did vary across clustering algorithms and
granularity, the results in terms of S' were consistent.
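
The text does not name the clustering algorithms that were tried. One simple way to keep the entropy of a clustered ordered variable high is equal-frequency binning, sketched below as an assumption and not as the method used in the example. The tenure values are made up.

```python
import math
from collections import Counter

def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so bins hold (nearly) equal numbers of cases.

    Ties are split by position here; a fuller version would keep equal
    values in the same bin.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

def entropy_bits(labels):
    """Entropy in bits of the empirical distribution of the labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

tenure_days = [3, 10, 14, 30, 45, 60, 90, 120, 200, 365, 400, 730]  # made-up values
bins = equal_frequency_bins(tenure_days, 4)
h = entropy_bits(bins)
```

With four equally filled bins the clustered variable reaches the maximum of 2 bits for four states. Fewer bins lower the df of each transmission at the cost of discarding more of the original tenure information.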

Transmissions were calculated by combining cluster analysis and
information-theoretic analysis. For the video clerk data set (containing 160 IVs) it was
decided that the cardinality of the sets of IVs for which transmissions could be
calculated was 4. From there, two additional orders of cardinality were calculated by
supplementing the 4th order transmissions (as described in step 3 of the process). The
union of independent variables appearing in the largest transmissions was taken to be
S'. Experimentation with neural models using S' of different cardinalities yielded the
best results when |S'| = 56.

An interesting aspect of the application questions chosen by the feature selection
method was the mix of bio-data and psychometrics. Of the 56 features used as inputs
for the most successful model, 31 came from the bio-data section of the application and
25 came from the psychological assessment. Of particular interest was the "coupling"
of certain bio-data and assessment questions. Such pairs would appear together
throughout the analysis of transmission over a range of cardinalities (e.g., they would
appear as a highly predictive pair and would subsequently appear together in
higher-order sets of IVs).

The synergistic effect between the two classes of question became apparent
when models were generated using exclusively one class or the other (using only
psychometrics or only bio-data questions). With comparable numbers of inputs, these
models performed significantly worse than their more diverse counterparts. These
results are particularly interesting since psychological assessments typically do not
include responses from such diverse classes of questions.

In the example, the most successful neural model developed was a single hidden
layer, feed-forward neural network with 56 inputs (|S'| = 56) and 40 hidden nodes. The
network was trained using the conjugate gradient method. Of the total data set size of
2084, 1784 were allocated to the training set and 300 were "hold-out".

The performance of behavioral prediction models can be measured
using the correlation coefficient. For the neural model described, the correlation
between prediction and actual tenure for the hold-out sample was ρ = 0.51. For
comparison, a number of other models were generated using either no feature selection
or alternate feature selection methods. These models used the same network
architecture and training algorithm. The best model generated using the entire data set
(e.g., all features) was a 160-90-1 configuration (160 inputs and 90 hidden layer nodes),
which achieved a maximum hold-out correlation of ρ = 0.44. Alternate feature selection
algorithms (genetic algorithms, and forward and reverse stepwise regression), using the
same number of features (56), failed to achieve a hold-out correlation better than ρ =
0.47.
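
For reference, a hold-out correlation of this kind can be computed as the Pearson correlation between predicted and observed tenure. The sketch below uses made-up numbers, not the data from the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

predicted = [40.0, 55.0, 70.0, 90.0, 120.0]   # made-up model outputs (days)
actual = [30.0, 80.0, 60.0, 150.0, 100.0]     # made-up observed tenure (days)
r = pearson(predicted, actual)
```

Because the coefficient is computed only on cases withheld from training, it estimates how well the model ranks new applicants and not how well it memorized the training set.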

Information-theoretic feature selection is a viable and accurate method of
identifying predictors of job performance in employee selection. The capacity to
identify non-linear and higher-order interactions ignored by other feature selection
methods represents a significant technique in constructing predictive models.

Alternatives 

It should be understood that the programs, processes, or methods described
herein are not related or limited to any particular type of computer apparatus, unless
indicated otherwise. Various types of general purpose or specialized computer
apparatus may be used with or perform operations in accordance with the teachings
described herein. Elements of the illustrated embodiment shown in software may be
implemented in hardware and vice versa. In view of the many possible embodiments to
which the principles of our invention may be applied, it should be recognized that the
detailed embodiments are illustrative only and should not be taken as limiting the scope
of our invention. Rather, we claim as our invention all such embodiments as may come
within the scope and spirit of the following claims and equivalents thereto.

CLAIMS

We Claim:

1. An apparatus for assisting in determining the suitability of an individual
for employment by an employer, the apparatus comprising:

an electronic data interrogator operable to present a first set of a plurality of
questions to the individual;

an electronic answer capturer operable to electronically store the individual's
responses to at least a selected plurality of the first set of questions presented to the
individual;

an electronic predictor responsive to the stored answers and operable to predict
at least one post-hire outcome if the individual were to be employed by the employer,
the predictor providing a prediction of the outcome based upon correlations of the stored
answers with answers to sets of questions by other individuals for which post-hire
information has been collected; and

an electronic results provider providing an output indicative of the outcome to
assist in determining the suitability of the individual for employment by the employer.

2. An apparatus according to claim 1 wherein the post-hire outcome
indicates whether the individual is predicted to be eligible for re-hire after termination.

3. An apparatus according to claim 1 wherein the post-hire outcome
indicates whether the individual is predicted to be involuntarily terminated.

4. An apparatus according to claim 1 wherein the post-hire outcomes
indicate whether the individual is predicted to be involuntarily terminated and whether
the individual is predicted to be eligible for re-hire after termination.

5. An apparatus according to claim 1 wherein at least one of the predicted
outcomes is a predicted probability that a particular outcome value range will be
observed.

6. An apparatus according to claim 1 wherein at least one of the predicted
outcomes is a predicted value for a continuous variable.

7. An apparatus according to claim 1 wherein the predicted outcome is a
predicted range of values for a continuous variable.

8. An apparatus according to claim 1 wherein the predicted outcome
indicates whether the individual will belong to a particular group.

9. An apparatus according to claim 1 wherein at least one of the predicted
outcomes is a predicted ranking of the individual for the outcome.

10. An apparatus according to claim 1 wherein at least one of the predicted
outcomes indicates a predicted employment tenure for the individual.

11. An apparatus according to claim 1 wherein at least one of the predicted
outcomes indicates a predicted number of accidents for the individual.

12. An apparatus according to claim 1 wherein at least one of the predicted
outcomes indicates a predicted sales level for the individual.

13. An apparatus according to claim 1 wherein the predictor comprises an
artificial intelligence-based prediction system.






14. An apparatus according to claim 1 wherein the data interrogator is
located at a first location and the predictor is located at a second location which is
remote from the first location.

15. An apparatus according to claim 14 wherein the data interrogator and the
predictor are selectively electronically interconnected through a network.

16. An apparatus according to claim 15 wherein the network is the
worldwide web.

17. An apparatus according to claim 15 wherein the network is a telephone
network.

18. An apparatus according to claim 15 wherein the network is a satellite
network.

19. An apparatus according to claim 1 wherein the first set of questions may
be varied.

20. An apparatus according to claim 19 wherein the predictor is operable to
determine and indicate a lack of a correlation between one or more questions of the first
set of questions and at least one of the predicted outcomes, whereby questions which
lack the correlation may be discarded or modified.

21. An apparatus according to claim 1 wherein at least one of the predicted
outcomes is longevity with an employer and the answers to sets of questions by other
individuals comprise answers by employees of the employer for whom longevity has
been determined.

22. An apparatus according to claim 1 in which the predictor comprises at
least one model which provides a predictor of the probability of the individual
exhibiting at least one of the predicted outcomes, the model being based on correlations
between the at least one of the predicted outcomes and the answers to questions by the
other individuals, including answers by at least some employees of the employer, the
model taking at least selected answers of the stored answers as inputs to the model, a
probability of the individual exhibiting the at least one of the predicted outcomes being
provided as an output of the model.

23. An apparatus according to claim 22 wherein the model comprises at least
one neural network.

24. An apparatus according to claim 1 wherein the predictor is responsive to
the stored answers and operable to predict plural outcomes if the individual were to be
employed by the employer.

25. A method for assessing suitability of persons for employment based on 
information for hired employees, the method comprising: 

collecting pre-hire applicant information for hired employees before they are 

20 hired; 

collecting post-hire measures of the job effectiveness of hired employees; 

constmcting an artificial intelligence model identifying associations of patterns 
within the pre-bire data associated with patterns of job effectiveness in the post-hire 
data; 

25 collecting pre-hire information for a new applicant; and 

applying the artificial intelligence model to the pre-hire information for the new 
applicant to provide a prediction of the new applicant's suitability for employment. 



26. The method of claim 25 further comprising:

collecting post-hire information for the new applicant; and
using at least the pre-hire and post-hire information for the new applicant to 
refine the artificial intelligence model. 

27. The method of claim 25 further comprising:

constructing at least one other artificial intelligence model of a different type;
and

assessing the relative effectiveness of the artificial intelligence models at
predicting suitability of employees for employment based on actual employment
effectiveness of employees hired based on the models.

28. An apparatus for assisting in determining the suitability of an individual 
for employment by an employer, the apparatus comprising: 

means for electronically presenting a first set of a plurality of questions to the
individual;

means for electronically storing the individual's responses to at least a selected 
plurality of the first set of questions presented to the individual; 

responsive to the stored answers, means for predicting at least one post-hire
outcome if the individual were to be employed by the employer, the means for
predicting providing a prediction of the outcome based upon correlations of the at least
one characteristic with answers to sets of questions by other individuals and the
closeness of the stored answers to such correlations; and

means for providing an output indicative of the outcome to assist in determining 
the suitability of the individual for employment by the employer. 

29. An artificial intelligence-based system for predicting employee behaviors
based on pre-hire information collected for the employee, the system comprising:

an electronic device for presenting an employment application comprising a set
of questions to an employment candidate, wherein the electronic device is operable to
transmit answers of the employment candidate to a central store of employee
information, wherein the central store of employee information comprises information
collected for a plurality of candidate employees and a plurality of hired employees;

an artificial intelligence-based model constructed from information collected
from the hired employees based on answers provided by the hired employees and
employment behaviors observed for the hired employees;

a software system for supplying the answers of the employment candidate to the 
artificial intelligence-based model to produce predicted employment behaviors for the 
employment candidate; and 

a report generator to produce a hiring recommendation report for the
employment candidate based on the predicted employment behaviors of the employment
candidate.



30. A computer-implemented method of predicting employment
performance characteristics for a candidate employee based on pre-hire information
collected for hired employees, the method comprising:

collecting data indicating pre-hire information for a plurality of the hired 
employees; 

collecting data indicating post-hire outcomes for the hired employees;

constructing an artificial intelligence-based model from the pre-hire information
and the post-hire outcomes for the employees;

from the candidate employee, electronically collecting data indicating pre-hire
information of the candidate employee; and

applying the model to the collected pre-hire information of the candidate
employee to generate one or more predicted post-hire outcomes for the candidate
employee.

31. The method of claim 30 wherein collecting data from the candidate
employee comprises electronically presenting a set of questions at an electronic device
and electronically collecting answers to the questions at the electronic device.

32. The method of claim 30 wherein the pre-hire information comprises one
or more pre-hire characteristics and constructing the model comprises:

identifying one or more pre-hire characteristics as ineffective predictors; and

responsive to identifying the pre-hire characteristics as ineffective predictors,
omitting the ineffective predictors from the model.

33. The method of claim 30 further comprising:

providing a report indicating applicant flow.

34. The method of claim 30 wherein constructing the model comprises:

constructing a plurality of proposed models, wherein at least two of the models
are of different types; and

selecting a superior proposed model as the model to be used.

35. The method of claim 34 wherein at least two of the proposed models are 
different neural network types. 

36. The method of claim 35 wherein the two proposed models are both
feed-forward neural networks.

37. The method of claim 35 wherein the two proposed models are chosen
from the following:

back propagation, conjugate gradients, quasi-Newton, Levenberg-Marquardt,
quick propagation, delta-bar-delta, linear, radial basis function, and generalized
regression network.
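
The selection step recited in claim 34 (construct several proposed models of different types, then keep the superior one) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two candidate model types are simple stand-ins rather than the neural network types listed in claim 37, "superior" is taken to mean lowest mean squared error on held-out employees, and the data values and function names are hypothetical.

```python
from statistics import mean

def fit_constant(xs, ys):
    """Baseline candidate: always predict the mean observed outcome."""
    avg = mean(ys)
    return lambda x: avg

def fit_linear(xs, ys):
    """Second candidate: one-input least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mse(model, xs, ys):
    """Mean squared prediction error of a fitted model."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def select_superior(candidates, train, holdout):
    """Fit every candidate on the training data and return the name and
    fitted model with the lowest error on the held-out data."""
    train_x, train_y = train
    hold_x, hold_y = holdout
    fitted = {name: fit(train_x, train_y) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: mse(fitted[name], hold_x, hold_y))
    return best, fitted[best]

# Hypothetical data: a single pre-hire score and tenure in months.
train = ([1, 2, 3, 4, 5, 6], [3, 5, 7, 9, 11, 13])
holdout = ([7, 8], [15, 17])
best_name, best_model = select_superior(
    {"constant": fit_constant, "linear": fit_linear}, train, holdout
)
```

Scoring on employees held out of training, rather than on the training data itself, is one common way to compare models of different types on equal footing; the claims do not prescribe a particular comparison measure.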

38. The method of claim 30 wherein at least one of the predicted post-hire
outcomes is denoted as a probability that a particular value range of a job effectiveness
measure will be observed for a candidate employee.

39. The method of claim 30 wherein at least one of the predicted post-hire
outcomes is denoted as a value for a continuous variable.



40. The method of claim 30 wherein at least one of the predicted post-hire
outcomes is denoted as a relative ranking for an outcome.

41. The method of claim 40 wherein the ranking is relative to other
employment candidates.

42. The method of claim 40 wherein the ranking is relative to the hired
employees.
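
One way to express a predicted outcome as a relative ranking, as in claims 40 to 42, is a percentile rank against a reference group (other candidates or hired employees). The sketch below is an illustrative assumption, not a method stated in the claims; the function name and scores are hypothetical.

```python
from bisect import bisect_left

def percentile_rank(score, reference_scores):
    """Percentage of reference scores strictly below the given score."""
    ordered = sorted(reference_scores)
    if not ordered:
        raise ValueError("reference group must not be empty")
    below = bisect_left(ordered, score)  # count of scores < score
    return 100.0 * below / len(ordered)

# Hypothetical predicted tenure (months) for hired employees.
hired_predictions = [2, 4, 6, 8, 10]
candidate_rank = percentile_rank(7, hired_predictions)
```

Passing the predictions for other employment candidates instead of hired employees gives the ranking of claim 41 rather than claim 42.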

43. The method of claim 30 further comprising: 

storing a relative importance of one or more particular post-hire outcomes; and

generating automated hiring recommendations based on the predicted post-hire
outcomes for the candidate employees and the importance of the post-hire outcomes.
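
As a rough illustration of claim 43, stored importance values can be used as weights when combining several predicted outcomes into one recommendation. This is a sketch under stated assumptions: each predicted outcome is assumed to be scaled to the range 0 to 1 with higher being better, the weighted average and the 0.5 threshold are arbitrary choices, and all names are hypothetical.

```python
def recommend(predicted, importance, threshold=0.5):
    """Combine predicted outcomes into an importance-weighted score and
    a recommendation string.

    predicted:  outcome name -> predicted value in [0, 1] (assumed scale)
    importance: outcome name -> stored relative importance (weight)
    """
    total_weight = sum(importance.values())
    if total_weight <= 0:
        raise ValueError("importance weights must sum to a positive value")
    score = sum(
        weight * predicted[name] for name, weight in importance.items()
    ) / total_weight
    decision = "recommend" if score >= threshold else "do not recommend"
    return score, decision

# Hypothetical employer preference: tenure matters three times as much as sales.
importance = {"tenure": 3.0, "sales": 1.0}
score, decision = recommend({"tenure": 0.8, "sales": 0.4}, importance)
```

Changing the stored weights changes the recommendation without rebuilding the underlying predictive model, which is the practical appeal of keeping importance separate from prediction.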

44. The method of claim 30 further comprising: 

refining the model based on newly-observed post-hire outcomes.

45. The method of claim 30 wherein the pre-hire information comprises
answers to questions on a job application, the method further comprising:

identifying one or more questions as ineffective predictors;

responsive to identifying the questions as ineffective predictors, modifying the
job application by removing the questions;

collecting new pre-hire information for additional candidate employees based on
the modified job application;

collecting new post-hire information for the additional candidate employees; and

constructing a refined artificial-intelligence model based on the additional pre-
hire and post-hire information for the additional candidate employees.

46. The method of claim 45 further comprising: 

responsive to determining pre-hire and post-hire information has been collected
for a sufficient number of additional employees, providing an indication that a refined
model can be constructed.

47. The method of claim 45 further comprising: 

providing a report indicating the identified questions are ineffective predictors. 

48. The method of claim 45 further comprising:

adding one or more new questions to the modified job application before
collecting additional pre-hire information.

49. The method of claim 48 wherein the new questions are composed based
on job skills appropriate for a particular job related to the job application.

50. The method of claim 48 further comprising: 
evaluating the effectiveness of the new questions.

51. An artificial intelligence-based employee performance prediction system
comprising:

a set of pre-hire characteristic identifiers;

a set of post-hire outcome identifiers;

a collection of data for employees, wherein the data includes values associated
with the pre-hire identifiers and the post-hire identifiers; and

an artificial intelligence-based model chosen from a set of candidate models, the
artificial intelligence-based model exhibiting superior ability at predicting values
associated with the post-hire outcome identifiers based on values associated with the
pre-hire characteristic identifiers in comparison to the other candidate models.

52. A computer-readable medium having a collection of employment-related 
data, the data comprising: 

pre-hire information for a plurality of employees, wherein the pre-hire
information comprises information electronically-collected from an applicant, wherein
the information comprises a plurality of pre-hire characteristics;

post-hire information for at least some of the plurality of employees, wherein the
information comprises a plurality of post-hire outcomes; and

a data structure identifying which of the pre-hire characteristics are effective in
predicting a set of one or more of the post-hire outcomes for a job applicant.

53. A method for providing an automated hiring recommendation for a new
potential employee, the method comprising:

collecting pre-hire information for potential employees;

storing the pre-hire information for the potential employees in a database;

after hiring a plurality of the potential employees, collecting employment
performance information for at least some of the hired employees;

storing the employment performance information collected from the hired 
employees; 

constructing an artificial intelligence-based model based on correlations between
the pre-hire information and the employment performance information collected from
one or more of the hired employees;

collecting pre-hire information for a new potential employee;

based on the artificial intelligence-based model, providing an automated hiring
recommendation for the new potential employee;

after hiring the new potential employee, collecting employment performance
information for the new potential employee;

adding the employment performance information for the new potential employee 
to the database; and 

modifying the artificial intelligence-based model based on the pre-hire and
employment performance information for the new potential employee.

54. A method for providing an automated hiring recommendation service for
an employer, the method comprising:

stationing a plurality of electronic devices at a plurality of employer sites,
wherein the electronic devices are operable to accept directly from one or more job
applicants answers to questions presented at the electronic devices;

sending the answers of at least one of the job applicants to a remote site for
analysis;

applying an artificial intelligence-based predictive model to the answers of the
at least one of the job applicants to generate an automated hiring recommendation; and

automatically sending the hiring recommendation to the employer.

55. A method of constructing a model generating one or more job
performance criteria predictors based on input pre-hire information, the method
comprising:

from a plurality of applicants, electronically collecting pre-hire information from
the applicants;

collecting post-hire information for the applicants based on job performance of
the applicants after hire; and

from the pre-hire information and the post-hire information, generating an
artificial intelligence-based predictive model operable to generate one or more job
performance criteria predictors based on input pre-hire information from new
applicants.



56. A computer-readable medium comprising computer-executable
instructions for performing the method of claim 55.

57. The method of claim 55 further comprising:

limiting the applicants for the model to those from a particular geographic area;
and

constructing the model as a geographically-specialized model.

58. The method of claim 55 further comprising:

limiting the applicants for the model to those with a particular educational level;
and

constructing the model as an educational level-specialized model.



59. The method of claim 55 further comprising:

limiting the applicants for the model to those with a particular occupation; and

constructing the model as an occupationally-specialized model.



60. The method of claim 55 wherein the model accepts one or more inputs, 
the method further comprising: 

identifying in the pre-hire information one or more characteristics that are
ineffective predictors; and

omitting the ineffective predictors as inputs to the model. 

61. The method of claim 55 wherein the pre-hire information comprises one
or more characteristics, the method further comprising:

identifying in the pre-hire information one or more characteristics that are
ineffective predictors; and

providing an indication that the characteristics no longer need to be collected. 



62. The method of claim 55 wherein job performance criteria predictors
comprise a predictor indicating whether a job candidate will be voluntarily terminated.

63. The method of claim 55 wherein job performance criteria predictors
comprise a predictor indicating whether a job candidate will be eligible for rehire after
termination.

64. The method of claim 55 wherein the pre-hire information comprises one
or more characteristics, the method further comprising:

identifying in the pre-hire information one or more characteristics that are
ineffective predictors; and

responsive to identifying the ineffective predictors, collecting new pre-hire 
information not including the ineffective predictors; and 

building a refined model based on the new pre-hire information. 

65. The method of claim 64 further comprising:

adding one or more new characteristics to be collected when collecting the new
pre-hire information.

66. The method of claim 65 further comprising:

evaluating the effectiveness of the new characteristics.

67. A method of constructing a model predicting employment performance 
based on a set of input employment parameters, the method comprising: 

selecting a set of input parameters indicating pre-hire characteristics of an
employee, wherein the pre-hire characteristics are available before hiring the employee
and are collected electronically from the employee;

selecting a set of output parameters indicating post-hire outcomes available after 
hiring the employee; and 

training a neural network with the input and output parameters. 

68. The method of claim 67 further comprising:

choosing a set of one or more candidate characteristics, wherein the
characteristics indicate data available before hiring an employee;

testing effectiveness of the candidate characteristics in predicting the post-hire
characteristics; and

responsive to determining the candidate information is effective, incorporating 
the candidate information into the model. 



69. A method for constructing an artificial intelligence-based employment
selection process based on pre-hire information comprising personal employee
characteristics and post-hire information comprising employee job performance
observation information, the method comprising:

generating a plurality of predictive artificial intelligence models based on the
pre-hire and post-hire information, wherein at least two of the artificial intelligence
models are of different types;

testing effectiveness of the models to select an effective model; and 
applying the effective model to predict post-hire information not yet observed. 



70. The method of claim 69 wherein at least one of the models is a neural
network.

71. The method of claim 70 wherein at least one of the models is an expert
system.

72. The method of claim 69 wherein at least one of the models is a fuzzy
logic system.

73. The method of claim 69 wherein at least one of the models is an
information theoretic model.

74. The method of claim 69 wherein at least one of the models is a
neuro-fuzzy model.

75. The method of claim 69 further comprising:

identifying at least one of the models as exhibiting impermissible bias; and

avoiding use of the models exhibiting impermissible bias.

76. The method of claim 75 wherein the impermissible bias is against a
protected group of persons.

77. A computer-implemented method of refining an artificial-intelligence
based employee performance selection system, the method comprising:

collecting information via an electronic device presenting a set of questions to
employment candidates, wherein the questions are stored in a computer-readable
medium;

testing effectiveness of at least one of the questions in predicting the post-hire 
information; and 

responsive to determining the question is ineffective, deleting the question from
the computer-readable medium.

78. The method of claim 77 wherein effectiveness comprises predictiveness
tested based on information theoretic techniques.
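
Claim 78 does not name a specific information theoretic technique. As one illustrative possibility, the mutual information between the answers to a question and an observed post-hire outcome can serve as a predictiveness score, with low-scoring questions flagged for deletion as in claim 77. The measure, the 0.05-bit cutoff, and all names and data below are assumptions made for this sketch.

```python
from collections import Counter
from math import log2

def mutual_information(answers, outcomes):
    """Mutual information, in bits, between two equal-length sequences
    of categorical values."""
    n = len(answers)
    joint = Counter(zip(answers, outcomes))
    p_answer = Counter(answers)
    p_outcome = Counter(outcomes)
    return sum(
        (count / n)
        * log2((count / n) / ((p_answer[a] / n) * (p_outcome[o] / n)))
        for (a, o), count in joint.items()
    )

def ineffective_questions(responses, outcomes, min_bits=0.05):
    """Return the questions whose answers carry less than min_bits of
    information about the outcome (hypothetical cutoff)."""
    return [
        question
        for question, answers in responses.items()
        if mutual_information(answers, outcomes) < min_bits
    ]

# Hypothetical answers from four hired employees and their observed outcome.
responses = {
    "q1": ["y", "y", "n", "n"],  # tracks the outcome exactly
    "q2": ["y", "n", "y", "n"],  # statistically unrelated to the outcome
}
outcomes = ["stay", "stay", "leave", "leave"]
flagged = ineffective_questions(responses, outcomes)
```

With real data the estimate would be noisy for small samples, so any cutoff would need to account for the number of employees observed.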




79. A computer-readable medium comprising a predictive model, the model 
comprising: 

inputs for accepting one or more characteristics based on pre-hire information
for a job applicant;

one or more predictive outputs indicating one or more predicted job
effectiveness criteria based on the inputs,

wherein the predictive model is an artificial intelligence-based model
constructed from pre-hire data electronically collected from a plurality of employees and
post-hire data, and the model generates its predictive outputs based on the similarity of
the inputs to pre-hire data collected for the plurality of employees and their respective
post-hire data.

80. The computer-readable medium of claim 79 wherein the predictive 
model comprises a predictive output indicating a rank for the job applicant.

81. The computer-readable medium of claim 80 wherein the rank is relative
to other applicants.

82. The computer-readable medium of claim 80 wherein the rank is relative
to the plurality of employees.

83. The computer-readable medium of claim 79 wherein the predictive
model comprises a predictive output indicating probability of group membership for the
job applicant.

84. The computer-readable medium of claim 79 wherein the predictive
model comprises a predictive output indicating predicted tenure for the job applicant.

85. The computer-readable medium of claim 79 wherein the predictive
model comprises a predictive output indicating predicted tenure for the job applicant.

86. The computer-readable medium of claim 79 wherein the predictive
model comprises a predictive output indicating predicted number of accidents for the
job applicant.

87. The computer-readable medium of claim 79 wherein the predictive
model comprises a predictive output indicating whether the applicant will be
involuntarily terminated.

88. The computer-readable medium of claim 79 wherein the predictive
model comprises a predictive output indicating whether the applicant will be eligible for
rehire after termination.

89. A computer-readable medium comprising a refined predictive model, the 
model comprising: 

inputs for accepting one or more characteristics based on pre-hire information 
for a job applicant; 

one or more predictive outputs indicating one or more predicted job
effectiveness criteria based on the inputs,

wherein the predictive model is constructed from pre-hire data electronically
collected from a plurality of employees and post-hire data, wherein the pre-hire data is
based on a question set refined by having identified and removed one or more questions
as ineffective.

90. The computer-readable medium of claim 89 wherein the ineffective 
questions are identified via an information transfer technique.

91. The computer-readable medium of claim 89 wherein the model is an
artificial intelligence-based model.